CN111902812A - Electronic device and control method thereof


Info

Publication number
CN111902812A
Authority
CN
China
Prior art keywords
text
illustration
illustrations
artificial intelligence
intelligence model
Legal status
Pending
Application number
CN201980018825.7A
Other languages
Chinese (zh)
Inventor
金周荣
李贤优
Current Assignee
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Application filed by Samsung Electronics Co Ltd
Priority claimed from PCT/KR2019/002853 (WO2019177344A1)
Publication of CN111902812A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/53 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/53 Querying
    • G06F 16/538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/10 Office automation; Time management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks

Abstract

An artificial intelligence system using an Artificial Intelligence (AI) model learned according to at least one of a machine learning, neural network, or deep learning algorithm, an application thereof, and a method of controlling an electronic device are provided. The method includes acquiring text based on user input, determining a plurality of key terms from the acquired text, acquiring a plurality of first illustrations corresponding to the plurality of key terms, acquiring a second illustration by synthesizing at least two or more of the plurality of first illustrations, and outputting the acquired second illustration.

Description

Electronic device and control method thereof
Technical Field
The present disclosure relates to an electronic apparatus and a control method thereof. More particularly, the present disclosure relates to an electronic device for generating an image associated with text and a control method thereof.
Background
Apparatuses and methods consistent with the present disclosure relate to Artificial Intelligence (AI) systems that simulate functions (e.g., cognition, determination, etc.) of the human brain using machine learning algorithms and applications thereof.
In recent years, artificial intelligence systems that implement human-level intelligence have been used in various fields. An artificial intelligence system is a system in which a machine learns, makes determinations, and becomes intelligent by itself, unlike existing rule-based intelligence systems. The more an artificial intelligence system is used, the higher its recognition rate becomes and the more accurately it understands user preferences; as a result, existing rule-based intelligence systems are increasingly being replaced by artificial intelligence systems based on deep learning.
Artificial intelligence technology includes machine learning (e.g., deep learning) and element technologies that utilize machine learning.
Machine learning is an algorithmic technique that classifies/learns the characteristics of input data by itself, and element technology is a technology that simulates functions of the human brain, such as cognition and determination, by using a machine learning algorithm such as deep learning, and includes technical fields such as language understanding, visual understanding, inference and prediction, knowledge representation, and motion control.
Various fields to which artificial intelligence technology is applied are as follows. Language understanding is a technology for recognizing, applying, and processing human language/characters, and includes natural language processing, machine translation, dialog systems, query response, speech recognition/synthesis, and the like. Visual understanding is a technology for recognizing and processing objects as human vision does, and includes object recognition, object tracking, image search, human recognition, scene understanding, spatial understanding, image enhancement, and the like. Inference and prediction is a technology for determining information and making logical inferences and predictions, and includes knowledge/probability-based inference, optimized prediction, preference-based planning, recommendation, and the like. Knowledge representation is a technology for automatically processing human experience information into knowledge data, and includes knowledge construction (data generation/classification), knowledge management (data utilization), and the like. Motion control is a technology for controlling the autonomous driving of a vehicle and the movement of a robot, and includes movement control (navigation, collision avoidance, driving), operation control (behavior control), and the like.
Meanwhile, in order to convey information effectively, books, newspapers, advertisements, presentations, and the like may be created by inserting illustrations together with text. Conventionally, finding a desired illustration takes a long time because illustrations suitable for the text must be found one by one, and it is also difficult to unify the design of the illustrations inserted into a single document.
The above information is provided merely as background information to aid in understanding the present disclosure. No determination has been made, and no assertion is made, as to whether any of the above might be applicable as prior art with regard to the present disclosure.
Disclosure of Invention
Technical problem
Aspects of the present disclosure are to address at least the above problems and/or disadvantages and to provide at least the advantages described below. Accordingly, an aspect of the present disclosure is to provide an electronic device that generates an image associated with text using an Artificial Intelligence (AI) model and a control method thereof.
Additional aspects will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the presented embodiments.
Technical scheme
According to an aspect of the present disclosure, a method of controlling an electronic device is provided. The method includes acquiring text based on user input, determining a plurality of key terms from the acquired text, acquiring a plurality of first illustrations corresponding to the plurality of key terms, acquiring a second illustration by synthesizing at least two or more of the plurality of first illustrations, and outputting the acquired second illustration.
According to another aspect of the present disclosure, an electronic device is provided. The electronic device includes a memory configured to store one or more instructions, and at least one processor coupled to the memory, wherein the at least one processor is configured to execute the one or more instructions to acquire text based on user input, determine a plurality of key terms from the acquired text, acquire a plurality of first illustrations corresponding to the plurality of key terms, acquire a second illustration by synthesizing at least two or more of the plurality of first illustrations, and output the acquired second illustration.
Other aspects, advantages and salient features of the invention will become apparent to those skilled in the art from the following detailed description, which, taken in conjunction with the annexed drawings, discloses various embodiments of the invention.
Drawings
The above and other aspects, features and advantages of certain embodiments of the present disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings, in which:
fig. 1 is a diagram for describing an illustration providing method according to an embodiment of the present disclosure;
fig. 2A is a flowchart for describing a control method of an electronic device according to an embodiment of the present disclosure;
fig. 2B is a flowchart for describing a control method of an electronic apparatus according to an embodiment of the present disclosure;
fig. 3 is a diagram illustrating an example of a learning method using a generative adversarial network (GAN) according to an embodiment of the present disclosure;
FIG. 4 is a diagram for describing an illustration search method using a database including illustrations matched with tag information, according to an embodiment of the present disclosure;
FIGS. 5, 6, 7, and 8 are diagrams for describing embodiments of obtaining a composite illustration in which multiple illustrations are synthesized, according to various embodiments of the present disclosure;
FIGS. 9, 10, and 11 are diagrams for describing embodiments of providing multiple composite illustrations synthesized in various combinations, according to various embodiments of the present disclosure;
FIG. 12 is a diagram for describing an embodiment of acquiring illustrations associated with text and corresponding to a design of a presentation image, according to an embodiment of the present disclosure;
FIGS. 13, 14, 15, and 16 are diagrams for describing user interfaces for providing illustrations according to various embodiments of the present disclosure;
FIGS. 17 and 18A are diagrams for describing embodiments in which an illustration generation function is applied to a messenger program, according to various embodiments of the present disclosure;
fig. 18B is a diagram for describing an embodiment of the present disclosure in which an illustration generation function is applied to a keyboard program according to an embodiment of the present disclosure;
fig. 19 is a block diagram for describing a configuration of an electronic apparatus according to an embodiment of the present disclosure;
FIG. 20A is a flow diagram of a network system using a recognition model according to various embodiments of the present disclosure;
FIG. 20B is a flow diagram of a network system using artificial intelligence models in accordance with an embodiment of the present disclosure;
fig. 20C is a configuration diagram of a network system according to an embodiment of the present disclosure;
FIG. 21 is a block diagram depicting an electronic device for learning and using recognition models in accordance with an embodiment of the present disclosure; and
fig. 22 and 23 are block diagrams for describing a learner and an analyzer according to various embodiments of the present disclosure.
Throughout the drawings, it should be noted that the same reference numerals are used to describe the same or similar elements, features and structures.
Detailed Description
The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of various embodiments of the disclosure as defined by the claims and their equivalents. It includes various specific details that are helpful for understanding, but these specific details are merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the various embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions may be omitted for clarity and conciseness.
The terms and words used in the following description and claims are not limited to bibliographic meanings, but are used only by the inventors to enable a clear and consistent understanding of the disclosure. Therefore, it will be apparent to those skilled in the art that the following descriptions of the various embodiments of the present disclosure are provided for illustration only and not for the purpose of limiting the disclosure as defined by the appended claims and their equivalents.
It should be understood that the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a component surface" includes reference to one or more such surfaces.
In the present disclosure, the expressions "have", "may have", "include", "may include", etc. indicate the presence of corresponding features (e.g., values, functions, operations, components such as components, etc.), and do not exclude the presence of additional features.
In the present disclosure, the expressions "a or B", "at least one of a and/or B", "one or more of a and/or B", etc. may include all possible combinations of the items listed together. For example, "a or B," "at least one of a and B," or "at least one of a or B" may represent all of the following: 1) a case including at least one A, 2) a case including at least one B, or 3) a case including both at least one A and at least one B.
The expressions "first", "second", and the like, as used in the present disclosure may indicate various components, regardless of the order and/or importance of the components, and will only be used to distinguish one component from other components, and do not limit the corresponding components. For example, the first user equipment and the second user equipment may indicate different user equipments regardless of their order or importance. For example, a first component may be named a second component, and a second component may also be similarly named a first component, without departing from the scope of the present disclosure.
The terms "module," "unit," "component," and the like as used in the present disclosure refer to a component that performs at least one function or operation, and such component may be implemented in hardware or software, or may be implemented in a combination of hardware and software. Furthermore, a plurality of "modules", "units", "components", etc. may be integrated into at least one module or chip and may be implemented in at least one processor, except where they need each be implemented in separate specific hardware.
When any component (e.g., a first component) is referred to as being (operatively or communicatively) "coupled with/to" or "connected to" another component (e.g., a second component), it should be understood that the component may be directly coupled with/to the other component, or may be coupled with/to the other component through a further component (e.g., a third component). On the other hand, when any component (e.g., a first component) is referred to as being "directly coupled" or "directly connected" to another component (e.g., a second component), it should be understood that no further component (e.g., a third component) is present between the two components.
As used in this disclosure, the expression "configured (or set) to" may be replaced by the expressions "suitable for", "having the capacity to", "designed to", "adapted to", "made to", or "capable of", as the case may be. The term "configured (or set) to" may not necessarily mean "specifically designed to" in hardware. Instead, the expression "a device configured to …" may mean that the device is "capable of" operating together with other devices or components. For example, "a processor configured (or set) to perform A, B, and C" may mean a dedicated processor (e.g., an embedded processor) for performing the corresponding operations, or a general-purpose processor (e.g., a central processing unit (CPU) or an application processor) that may perform the corresponding operations by executing one or more software programs stored in a memory device.
The terminology used in the present disclosure may be used for the purpose of describing particular embodiments only and is not intended to limit the scope of other embodiments. Terms used in the specification, including technical and scientific terms, may have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Among the terms used in the present disclosure, terms defined in a general dictionary may be interpreted as meanings identical or similar to those in the context of the related art, and are not interpreted as ideal or excessively formal meanings unless explicitly defined in the present disclosure. In some cases, the terms may not be construed to exclude embodiments of the disclosure even if they are defined in the disclosure.
An electronic apparatus according to various embodiments of the present disclosure may include, for example, at least one of a smartphone, a tablet personal computer (PC), a mobile phone, a video phone, an electronic book reader, a desktop PC, a laptop PC, a netbook computer, a workstation, a server, a personal digital assistant (PDA), a portable multimedia player (PMP), a moving picture experts group phase 1 or phase 2 (MPEG-1 or MPEG-2) audio layer 3 (MP3) player, a mobile medical device, a camera, or a wearable device. According to various embodiments, the wearable device may include at least one of an accessory-type wearable device (e.g., a watch, a ring, a bracelet, an ankle bracelet, a necklace, glasses, a contact lens, or a head-mounted device (HMD)), a textile- or garment-integrated wearable device (e.g., an electronic garment), a body-attached wearable device (e.g., a skin pad or a tattoo), or a bio-implantable wearable device (e.g., an implantable circuit).
In some embodiments, the electronic device may be a home appliance. The home appliance may include at least one of, for example, a television (TV), a digital video disc (DVD) player, an audio system, a refrigerator, an air conditioner, a cleaner, an oven, a microwave oven, a washing machine, an air purifier, a set-top box, a home automation control panel, a security control panel, a TV box (e.g., Samsung HomeSync™, Apple TV™, or Google TV™), a game console (e.g., Xbox™, PlayStation™), an electronic dictionary, an electronic key, or a camcorder.
In other embodiments, the electronic device may include at least one of various medical devices (e.g., various portable medical measurement devices (such as a blood glucose meter, a heart rate meter, a sphygmomanometer, or a thermometer), magnetic resonance angiography (MRA), magnetic resonance imaging (MRI), computed tomography (CT), an imaging device, an ultrasonic device, etc.), a navigation device, a global navigation satellite system (GNSS), an event data recorder (EDR), a flight data recorder (FDR), a car infotainment device, marine electronic equipment (e.g., a marine navigation device, a gyrocompass, etc.), avionics, a security device, a vehicle head unit, an industrial or home robot, an automated teller machine of a financial institution, a point of sale (POS) device of a store, or an Internet of things (IoT) device (e.g., a light bulb, various sensors, an electricity or gas meter, a sprinkler system, a fire alarm, a thermostat, a street light, a toaster, exercise equipment, a hot water tank, a heater, a boiler, etc.).
According to some embodiments, the electronic device may include at least one of furniture or a part of a building/structure, an electronic board, an electronic signature receiving device, a projector, or various meters (e.g., a water meter, an electricity meter, a gas meter, a radio wave meter, etc.). In various embodiments, the electronic device may be a combination of one or more of the various devices described above. An electronic device according to some embodiments may be a flexible electronic device. Furthermore, the electronic device according to the embodiments of the present disclosure is not limited to the above-described devices, and may include new electronic devices according to technical development.
Hereinafter, the present disclosure will be described in detail with reference to the accompanying drawings.
Fig. 1 is a diagram for describing an illustration providing method according to an embodiment of the present disclosure.
Referring to FIG. 1, if a user inputs text 10 as a presentation script through a presentation program such as Microsoft PowerPoint™, an illustration 20 corresponding to the text 10 may be provided. According to the present disclosure, artificial intelligence technology may be used to detect the meaning of the text 10, and an illustration 20 that matches the meaning of the text 10 may be provided.
Such an illustration providing function may be provided as a plug-in or add-on to presentation software such as Microsoft PowerPoint™ or Keynote™, or may be provided as separate software.
The illustration providing function according to the present disclosure may be applied not only to presentation materials but also to any field using images suitable for texts such as books, newspapers, advertisements, magazines, electronic postcards, e-mails, instant messenger, and the like.
The term "illustration" used in the present disclosure may also be referred to as terms such as pictograms, plane icons (flatforms), international system of graphical image education (ISOTYPE), information charts, images (video or still images), pictures, emoticons, and the like.
The illustrations used in the present disclosure may be created directly by the entity providing the service, or may be collected externally. When collecting illustrations externally, the entity providing the service should collect and use only illustrations for which copyright issues have been resolved. If copyrighted illustrations are used to provide higher-quality illustrations, the entity providing the service should resolve the copyright issues, and a fee may be charged to the user for providing the service.
The illustration providing functionality according to various embodiments of the present disclosure may be implemented by an electronic device. Hereinafter, a control method of an electronic device according to an embodiment of the present disclosure will be described with reference to fig. 2A and 2B.
Fig. 2A and 2B are flowcharts for describing a control method of an electronic apparatus according to an embodiment of the present disclosure.
Referring to fig. 2A, an electronic device according to an embodiment of the present disclosure acquires text based on user input in operation S210. In particular, the electronic device may provide a presentation image, receive text for the presentation image, and retrieve the text for the presentation image based on user input.
Here, the presentation image may be a screen provided by executing presentation software, for example, a screen as shown in fig. 1. The presentation image may be displayed through a display embedded in the electronic apparatus or may be displayed through an external display device connected to the electronic apparatus. In addition, text used to present images may also be referred to as scripts, announcements, and the like.
For example, as shown in fig. 1, text 10 may be input to a text input window provided on a screen displaying a presentation image. The electronic apparatus may receive text for presenting an image through an input device. The input devices may include, for example, a keyboard, touchpad, mouse, buttons, and the like. The input device may be an external input device embedded in the electronic apparatus or connected to the electronic apparatus.
Meanwhile, the user input may be a voice input according to a user utterance. Specifically, the electronic device may acquire the utterance information of the user by receiving a voice input of the user and analyzing the received voice input, and acquire a text corresponding to the acquired utterance information of the user.
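As a minimal sketch of such a voice-to-text step, the following uses the third-party SpeechRecognition package for Python; the choice of library and of the Google Web Speech backend is an assumption for illustration, not part of the patent, and any speech-to-text service could fill this role:

```python
import speech_recognition as sr  # third-party package: SpeechRecognition

def text_from_voice():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:      # requires the PyAudio package
        audio = recognizer.listen(source)
    # Convert the captured utterance to text; the recognizer backend
    # here is a placeholder assumption.
    return recognizer.recognize_google(audio)
```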
When acquiring the text, the electronic device determines (or identifies) a plurality of key terms (or keywords) from the acquired text in operation S220. In addition, if a plurality of key terms are determined, the electronic device acquires a plurality of first illustrations corresponding to the plurality of key terms in operation S230.
Specifically, the electronic device may input information and text for a design of a presentation image to a first artificial intelligence model learned through an artificial intelligence algorithm, thereby acquiring a plurality of first illustrations associated with the text and corresponding to the design of the presentation image. For example, referring to FIG. 1, the text 10 "great team collaboration leads to success" may be entered into an artificial intelligence model learned through an artificial intelligence algorithm to obtain the illustration 20 associated with the text 10.
Meanwhile, according to an embodiment of the present disclosure, the artificial intelligence model may be learned through a generative adversarial network (GAN). The key concept of GAN technology is that a generative model and a discriminative model oppose each other and gradually improve each other's performance. Fig. 3 shows an example of a learning method using a GAN.
Fig. 3 is a diagram illustrating an example of a learning method using a generative adversarial network (GAN) according to an embodiment of the present disclosure.
Referring to fig. 3, the generative model 310 generates an arbitrary image (pseudo image) from random noise, and the discriminative model 320 discriminates between real images (or learning data) and the pseudo images generated by the generative model. The generative model 310 is learned in a direction such that the discriminative model 320 gradually fails to discriminate between real and pseudo images, while the discriminative model 320 is learned in a direction of better discriminating the real images from the pseudo images. As learning progresses, the generative model 310 may generate pseudo images that are substantially similar to real images. The generative model 310 learned as described above may be used as the artificial intelligence model in operation S230.
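The adversarial training loop just described can be sketched in a few lines of PyTorch. This is an illustrative sketch only, not the patent's implementation; the network sizes, optimizer settings, and image dimensions are placeholder assumptions:

```python
import torch
import torch.nn as nn

# Toy sizes; the patent's model would be trained on illustration images.
NOISE_DIM, IMG_DIM = 64, 784

G = nn.Sequential(nn.Linear(NOISE_DIM, 256), nn.ReLU(),
                  nn.Linear(256, IMG_DIM), nn.Tanh())   # generative model 310
D = nn.Sequential(nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))                    # discriminative model 320

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_images):            # real_images: (batch, IMG_DIM)
    batch = real_images.size(0)

    # 1) The discriminator learns to tell real images from pseudo images.
    fake = G(torch.randn(batch, NOISE_DIM)).detach()
    loss_d = (bce(D(real_images), torch.ones(batch, 1)) +
              bce(D(fake), torch.zeros(batch, 1)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) The generator learns to make the discriminator label its output real.
    fake = G(torch.randn(batch, NOISE_DIM))
    loss_g = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```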
According to another embodiment of the present disclosure, at least one keyword may be acquired from the text using an artificial intelligence model, and an illustration corresponding to the at least one acquired keyword may be searched from a pre-stored database in operation S230. For example, artificial intelligence models for natural language processing may be provided, and such artificial intelligence models may be used to perform morphological analysis, key term extraction, meaning detection, and keyword association (e.g., detection of homonyms, background words, key terms, etc.).
Fig. 4 is a diagram for describing an illustration search method according to an embodiment of the present disclosure using a database including illustrations matched with tag information according to an embodiment of the present disclosure.
Referring to fig. 4, a database may be provided in which illustrations are matched with tag information. In this case, for example, if the text "in recent years, the number of startups related to artificial intelligence has increased due to breakthroughs in artificial intelligence technology" is input, the text may be input into an artificial intelligence model for natural language processing (NLP) to acquire "artificial intelligence", "startup", and "increase" as key terms, and an illustration matched with tag information including the key terms may be searched for in the database.
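The tag-matching lookup can be sketched as follows. The frequency-based key-term extractor below is a toy stand-in for the learned NLP model, and the database entries, tag names, and stop-word list are hypothetical:

```python
from collections import Counter

STOP_WORDS = {"in", "recent", "years", "the", "to", "due", "of",
              "has", "have", "related"}

ILLUSTRATION_DB = [  # hypothetical database: illustration id -> tag set
    {"id": "brain.svg",    "tags": {"artificial", "intelligence"}},
    {"id": "rocket.svg",   "tags": {"startup", "launch"}},
    {"id": "graph_up.svg", "tags": {"increase", "growth"}},
]

def extract_key_terms(text, top_n=4):
    # Toy stand-in for the NLP model: keep the most frequent non-stop-words.
    words = [w.strip(".,").lower() for w in text.split()]
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [w for w, _ in counts.most_common(top_n)]

def search_illustrations(key_terms):
    # Return illustrations whose tag set overlaps the extracted key terms.
    terms = set(key_terms)
    return [entry["id"] for entry in ILLUSTRATION_DB
            if entry["tags"] & terms]

text = ("in recent years, the number of startups related to artificial "
        "intelligence has increased due to breakthroughs in artificial "
        "intelligence technology")
print(search_illustrations(extract_key_terms(text)))
```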
According to an embodiment of the present disclosure, when an entire sentence is input, the sentence may be divided into phrases/paragraphs to sequentially generate and provide illustrations corresponding to the phrases/paragraphs having the main meaning of the entire sentence.
Meanwhile, the electronic device may input the text to the first artificial intelligence model to obtain a plurality of first illustrations associated with the text and having the same graphic effect as each other.
If the plurality of first illustrations are acquired, the electronic device acquires a second illustration by synthesizing at least two or more first illustrations among the plurality of first illustrations in operation S240. Specifically, the electronic device may input information regarding the design of the presentation image and the plurality of first illustrations to the learned second artificial intelligence model, thereby acquiring and outputting a second illustration modified such that at least two or more of the plurality of first illustrations correspond to the design of the presentation image. In other words, according to another embodiment of the present disclosure, the electronic device may input text to the artificial intelligence model, thereby obtaining a plurality of first illustrations, and obtain a second illustration in which the plurality of first illustrations are synthesized as an illustration associated with the text. That is, a synthesized illustration in which several illustrations are synthesized may be provided.
For example, by using an artificial intelligence model, the electronic device can determine a plurality of key terms from the text, obtain a plurality of first illustrations corresponding to the plurality of key terms, and obtain a synthesized second illustration by arranging the plurality of first illustrations according to the context of the plurality of key terms.
Meanwhile, the electronic device may acquire the second illustration by arranging and synthesizing the plurality of first illustrations according to the contexts of the plurality of key terms.
When the second illustration is acquired according to the above-described procedure, the electronic device outputs the acquired second illustration in operation S250. Specifically, when the second illustration is acquired, the electronic device may control the display to display the acquired second illustration and output the acquired second illustration through the display.
Meanwhile, as described above, the electronic device may acquire text based on user input in a state where the presentation image is not displayed. However, the electronic device may also display the presentation image and retrieve text for the presentation image. Hereinafter, an embodiment of acquiring text for a presentation image and acquiring an illustration based on the acquired text will be described again with reference to fig. 2B. Meanwhile, since each step has been described in detail with reference to fig. 2A, a repetitive description will be omitted.
Referring to fig. 2B, first, a presentation image is displayed in operation S210-1. The presentation image is an image provided by executing presentation software, and it may be, for example, a screen as shown in fig. 1.
When the presentation image is displayed, the electronic apparatus receives an input of text for the presentation image in operation S220-1. For example, as shown in fig. 1, the electronic device may receive input of text 10 on a text input window provided on a screen displaying a presentation image.
When text for presenting an image is input, the electronic device acquires at least one illustration associated with the text by inputting the text to an artificial intelligence model learned through an artificial intelligence algorithm in operation S230-1. For example, referring to FIG. 1, the text 10 "great team collaboration leads to success" may be entered into an artificial intelligence model learned by an artificial intelligence algorithm, thereby obtaining the illustration 20 associated with the text 10.
When the at least one illustration associated with the text is acquired, the electronic apparatus displays an illustration selected by the user among the acquired at least one illustration on the presentation image in operation S240-1.
Figs. 5 to 8 are diagrams for describing embodiments of obtaining a composite illustration in which multiple illustrations are synthesized, according to various embodiments of the present disclosure.
Referring to fig. 5, for example, if the text "in recent years, the number of startups related to artificial intelligence has increased due to breakthroughs in artificial intelligence technology" is input, an artificial intelligence model may be used to acquire "artificial intelligence", "startup", "breakthrough", and "increase" as key terms and to determine the associations between these key terms. Each association may be calculated as a numerical value (percentage) representing the degree of association between the respective words.
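One common way to produce such percentage association values is cosine similarity over term embeddings. The sketch below assumes hand-written toy embedding vectors; in the described system these would instead come from the learned language model:

```python
import numpy as np

# Hypothetical embedding vectors; placeholders for illustration only.
EMBEDDINGS = {
    "artificial intelligence": np.array([0.9, 0.1, 0.3]),
    "startup":                 np.array([0.4, 0.8, 0.2]),
    "breakthrough":            np.array([0.7, 0.2, 0.6]),
    "increase":                np.array([0.3, 0.7, 0.5]),
}

def association(term_a, term_b):
    """Degree of association between two key terms, as a percentage."""
    a, b = EMBEDDINGS[term_a], EMBEDDINGS[term_b]
    cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return round(100 * cosine, 1)

print(association("artificial intelligence", "startup"))
print(association("breakthrough", "increase"))
```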
Referring to fig. 6, the context of the acquired key terms may be determined. The process of determining the context includes a process of determining the role of each key term in the sentence, for example, whether each key term is a word corresponding to the background, a word corresponding to the phenomenon/result, or a word corresponding to the center of the sentence.
Referring to fig. 7, a plurality of illustrations corresponding to the acquired key terms may be acquired. Multiple illustrations can be categorized according to the association and context of key terms. For example, at least one illustration corresponding to a key term corresponding to the background and at least one illustration corresponding to a key term corresponding to a phenomenon/result may be classified.
Referring to fig. 8, a plurality of illustrations can be arranged and synthesized according to the association and context of key terms. For example, the illustration corresponding to the background word may be arranged behind other illustrations and may be set to have a higher transparency than other illustrations. Further, the illustrations corresponding to the center word and the word representing the phenomenon/result may be set to have lower transparency than other illustrations, and may be represented by thick lines. The user may use the composite illustration as shown in fig. 8 as it is, or may generate a new composite illustration by individually modifying a plurality of illustrations in the composite illustration (modifying the size, graphic effect, layout position, etc.) as needed.
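A minimal sketch of this arrangement step using the Pillow imaging library follows. The file paths, canvas size, positions, and alpha values are placeholder assumptions, and a flat alpha is applied to each illustration for simplicity:

```python
from PIL import Image

def compose_illustrations(background_path, center_path, out_path):
    """Arrange a background-word illustration behind a center-word
    illustration, giving the background higher transparency."""
    canvas = Image.new("RGBA", (800, 600), (255, 255, 255, 0))

    bg = Image.open(background_path).convert("RGBA").resize((500, 400))
    bg.putalpha(80)                   # background word: more transparent
    canvas.alpha_composite(bg, dest=(150, 100))

    fg = Image.open(center_path).convert("RGBA").resize((300, 240))
    fg.putalpha(255)                  # center word: opaque, drawn on top
    canvas.alpha_composite(fg, dest=(250, 200))

    canvas.save(out_path)             # out_path should be a .png file

compose_illustrations("startup.png", "growth.png", "composite.png")
```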
Figs. 9 to 11 are diagrams for describing embodiments of providing multiple synthesized illustrations synthesized in various combinations, according to various embodiments of the present disclosure.
According to embodiments of the present disclosure, multiple synthetic illustrations synthesized in various combinations may be provided. This embodiment will be described with reference to fig. 9 to 11.
Referring to fig. 9, for example, if the text "in recent years, the number of startups related to artificial intelligence has increased due to breakthroughs in artificial intelligence technology" is input, key terms may be extracted using an artificial intelligence model, and a plurality of illustrations corresponding to each key term may be acquired. For example, as shown in fig. 9, illustrations corresponding to the key term "artificial intelligence", illustrations corresponding to the key term "startup", and illustrations corresponding to the key term "increase" may be obtained, respectively. Further, referring to fig. 10, the illustrations of each key term may be combined in various ways. In this case, various combinations may be provided using the artificial intelligence model, in consideration of the similarity between the type of each illustration and the type of the presentation image, the similarity between the illustrations, and the like. Further, referring to fig. 11, by arranging and synthesizing the illustrations of each combination based on the context of the key terms, various synthesized illustrations can be provided in the form of a recommendation list. In this case, a first database composed of illustration layout templates defined according to the association types between words, and a second database composed of illustration layout templates defined according to the association types between phrases/paragraphs, may be used. The illustrations may be arranged by loading templates from these databases.
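The combination-and-ranking step can be sketched as follows. The illustration ids, the style labels, and the consistency heuristic are hypothetical stand-ins for the model's learned similarity scores:

```python
from itertools import product

CANDIDATES = {  # hypothetical candidate illustrations per key term
    "artificial intelligence": ["brain.svg", "chip.svg"],
    "startup": ["rocket.svg", "bulb.svg"],
    "increase": ["graph_up.svg", "arrow_up.svg"],
}
STYLE = {"brain.svg": "outline", "chip.svg": "filled",
         "rocket.svg": "outline", "bulb.svg": "filled",
         "graph_up.svg": "outline", "arrow_up.svg": "filled"}

def style_consistency(combo):
    # Fraction of illustration pairs in the combination sharing a style.
    pairs = [(a, b) for i, a in enumerate(combo) for b in combo[i + 1:]]
    return sum(STYLE[a] == STYLE[b] for a, b in pairs) / len(pairs)

combos = list(product(*CANDIDATES.values()))    # 2 x 2 x 2 = 8 combinations
recommendations = sorted(combos, key=style_consistency, reverse=True)
print(recommendations[:3])                      # top of the recommendation list
```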
The user may select and use a desired composite illustration from the recommendation list. Alternatively, instead of using a provided composite illustration as it is, the user may generate a new composite illustration by individually modifying the plurality of illustrations in the composite illustration (modifying the size, graphic effect, layout position, etc.) as needed. Weights may be assigned to the user-selected composite illustration, i.e., the user-selected combination, and may be used to relearn the artificial intelligence model. That is, reinforcement learning techniques may be used.
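A minimal sketch of how such selection weights might feed back into ranking follows; the learning rate and reward values are placeholder assumptions, and an actual system would relearn the model itself rather than keep a simple weight table:

```python
from collections import defaultdict

combo_weights = defaultdict(float)   # weight per illustration combination

def record_selection(combo, reward=1.0, lr=0.1):
    """Raise the weight of the combination the user selected."""
    combo_weights[tuple(combo)] += lr * reward

def rank(combos, base_score):
    """Re-rank combinations by a base score (e.g., style consistency)
    plus the accumulated user-preference weight."""
    return sorted(combos,
                  key=lambda c: base_score(c) + combo_weights[tuple(c)],
                  reverse=True)
```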
According to embodiments of the present disclosure, information regarding a design of a presentation image and text may be input to an artificial intelligence model to obtain at least one illustration associated with the text and corresponding to the design of the presentation image. The information regarding the design of the presentation image may include information such as the theme, background style, color, font, graphic effect, brightness, contrast, transparency, etc. of the presentation image, or the capture screen of the current presentation image as a whole.
In this case, the artificial intelligence model may include a first artificial intelligence model that generates a basic form of the illustration and a second artificial intelligence model that modifies the basic form of the illustration to correspond to the design of the presentation image. The basic form of the illustration may include a form to which no color or design effect is applied, a line-only picture, a black-and-white picture, and the like. This will be described with reference to fig. 12.
FIG. 12 is a diagram for describing an embodiment of acquiring illustrations associated with text and corresponding to a design of a presentation image, according to an embodiment of the present disclosure.
Referring to fig. 12, a first artificial intelligence model 1210 is a model that generates illustrations corresponding to texts, and is a model that learns using texts and images as learning data. The second artificial intelligence model 1220 is a model that modifies an image to correspond to the design of a presentation image, and is a model that is learned using information about the presentation image and text as learning data. The information on the design of the presentation image may be information on the theme, background style, color, font, graphic effect, brightness, contrast, transparency, etc. of the presentation image.
The second artificial intelligence model 1220 can modify the input image according to the design of the presentation image in relation to the subject matter of the input image, line style, line thickness, color, size, graphic effect, brightness, contrast, shape, layout, composition, etc. For example, the second artificial intelligence model 1220 may list colors used in the design of the presentation image, calculate color theme information of the presentation image by using the frequency, area, etc. of the colors as weights, and color the illustration using the colors in the calculated color theme. Alternatively, the second artificial intelligence model 1220 may define the style of the presentation image according to design elements such as line style, line thickness, curve frequency, edge processing, etc., used in the design of the presentation image, in addition to color information, and may use this information to change the graphic effect of the illustration.
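The color-theme computation described here, weighting colors by their frequency/area, can be sketched with Pillow as follows; the thumbnail size, quantization bucket width, and palette length are placeholder choices:

```python
from collections import Counter
from PIL import Image

def color_theme(slide_path, top_n=4):
    """Return the dominant colors of a slide, weighting each color by
    the area (pixel count) it covers."""
    img = Image.open(slide_path).convert("RGB").resize((64, 64))
    # Quantize channels to 32-value buckets so near-identical colors merge.
    pixels = [(r // 32 * 32, g // 32 * 32, b // 32 * 32)
              for r, g, b in img.getdata()]
    counts = Counter(pixels)          # frequency serves as the area weight
    return [color for color, _ in counts.most_common(top_n)]

theme = color_theme("current_slide.png")  # colors used to recolor the illustration
```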
The second artificial intelligence model may also add dynamic motion or sound effects to the illustration. A certain part of the illustration may have motion, such as rotating, blinking, shaking, or repeatedly increasing and decreasing in size, and when the illustration appears, an effect sound or short music that appropriately matches the illustration may be provided together with it.
According to an embodiment, at least one first illustration 1211 may be obtained by inputting text for a presentation image to the first artificial intelligence model 1210. Because the first artificial intelligence model 1210 can perform natural language processing, it can extract key terms from the text and detect the meaning of and associations between the key terms. The form of the first illustration 1211 may be generated according to the meanings of the key terms, the first illustration 1211 may be formed by arranging and synthesizing a plurality of illustrations according to the associations between the key terms and the meaning of the context, and the size, position, transparency, and the like of the plurality of illustrations may be determined according to the importance of each key term, that is, whether it is a background word, a main word, or a sub-word.
In addition, at least one second illustration 1221, which is the at least one first illustration 1211 modified to correspond to the design of the presentation image, may be acquired by inputting information about the design of the presentation image and the at least one first illustration 1211 to the second artificial intelligence model 1220.
Since the design of the presentation image may be different for each slide, an illustration may be generated that matches the design of the current slide.
According to another embodiment, even if there is no information about the design of the presentation image, the design of a new illustration may be determined to match the design of previously generated illustrations.
According to the embodiments of the present disclosure, when a design is changed by a user editing the design of the entire presentation image after applying an illustration acquired using an artificial intelligence model to the presentation image, the graphic effect of the illustration may be automatically changed to match the changed design. As another example, when a graphical effect applied to any one of the illustrations of the presentation image is changed by the user, the graphical effects of the other illustrations may be automatically changed in the same manner as the modified graphical effect.
According to the above-described embodiment, since the illustration that matches the design of the presentation image and has a sense of unity can be acquired, the user can create the presentation material with a higher degree of completeness in the sense of design.
At the same time, it is also important to match the design of the illustrations with the presentation image, but the illustrations also need to match each other. According to another embodiment of the present disclosure, the illustrations may be generated such that the designs between the illustrations are similar to each other. For example, a plurality of illustrations associated with text and having the same graphic effect as each other may be obtained by inputting text for presenting an image to an artificial intelligence model. The graphical effects may include shadow effects, reflection effects, neon effects, stereo effects, three-dimensional rotation effects, and the like.
In this case, designs between illustrations acquired from a sentence/paragraph specified by the user and the sentence/paragraph may be generated similarly to each other, or designs between entire illustrations of the same presentation material may be generated similarly to each other.
Referring again to fig. 2B, an illustration selected by the user from among the one or more illustrations acquired according to the various embodiments described above is displayed on the presentation image in operation S240-1.
For example, at least one illustration acquired according to the above-described embodiments may be provided in a certain area of the screen displaying the presentation image, and the illustration selected there may be displayed on the presentation image. As another example, the acquired illustration may be displayed on the presentation image without user selection.
The illustration displayed on the presentation image can be edited by additional user operations.
Figs. 13, 14, 15, and 16 are diagrams for describing user interfaces for providing illustrations according to various embodiments of the present disclosure.
Referring to fig. 13, the illustration generation function according to the present disclosure may be included in presentation software. When an illustration creation menu 1310 in a function menu provided by the presentation software is selected, a user interface (UI) 1320 for searching for illustrations may be displayed, and when text is entered into a text entry area 1321 provided in the UI 1320 and a search 1323 is selected, a search result 1325 including at least one illustration associated with the text may be provided.
In the search results 1325, the illustrations may be listed according to scores evaluated by the number of uses by other users, the degree of design matching, and the like.
The illustration selected by the user among the illustrations included in the search result 1325 may be displayed on the presentation image 1330. The user may display the illustration on the presentation image 1330 through an operation such as clicking, dragging and dropping, long touching, or the like using an input device such as a mouse or a touch panel.
Figs. 14 and 15 are diagrams for describing methods for providing illustrations according to embodiments of the present disclosure.
Referring to fig. 14, an illustration generation button 1410 may be provided in a script input window 1400 provided on a screen provided by presentation software. When a user enters text into the script input window 1400 and selects the illustration generation button 1410, at least one illustration 1421 associated with the text may be displayed on the presentation image 1420.
Referring to fig. 15, when a user specifies a block of text (e.g., drags the text) to generate an illustration and selects an illustration generation button 1510, at least one illustration 1531 associated with the specified text 1520 may be displayed on the presentation image 1530. According to this embodiment, an illustration can be generated for each specified sentence.
FIG. 16 illustrates a method for providing illustrations in accordance with an embodiment of the present disclosure.
Referring to fig. 16, when a user designates a block of text (e.g., by dragging over the text) to generate an illustration and performs a specific operation on the block-designated text (e.g., pressing the right mouse button or long-pressing a touch pad), a menu 1640 may be displayed. When the user selects an illustration generation item included in the menu 1640, the block-designated text is input to a text input region 1610 of a UI 1600 for searching for illustrations, and thereafter, when the user selects a search 1620, a search result 1630 including at least one illustration associated with the block-designated text may be provided. In the search results 1630, the illustrations may be listed according to scores evaluated by the number of uses by other users, the degree of design matching, and the like.
According to an embodiment of the present disclosure, by inputting the text to an artificial intelligence model that performs natural language processing, a plurality of key terms may be extracted from the text, the priorities of the plurality of key terms may be ranked, and information on the plurality of key terms and their priorities may be defined as a key term vector. This information may then be input to the artificial intelligence model learned to generate illustrations, thereby generating the form of the illustration.
Specifically, if a user inputs a sentence, the sentence is input into an artificial intelligence model that performs natural language processing, and the sentence is divided into phrase/paragraph units to detect the meaning of each phrase/paragraph. In addition, the relationships between the respective phrases/paragraphs (background/phenomenon, cause/effect, contrast, assertion and evidence, etc.) are defined. Further, the words of the sentences in the respective phrases/paragraphs are recognized, and each word is prioritized according to its role in the meaning of the phrase/paragraph that contains it. Further, by sorting the words of each phrase/paragraph according to priority, only the N (e.g., two) words with the highest priorities are used to generate the illustration, and the remaining words may be ignored.
Further, the associations between the N main words prioritized in the phrase/paragraph (subject and predicate, predicate and object, subject-predicate-object, and the like) and the degrees of connection between the words are defined. For example, if the sentence "increasingly large launch challenge" is input, core words such as "grow (1)", "launch (2)", "challenge (3)", and "increasingly (4)" may be extracted and prioritized.
In this case, the core words may form an illustration from a narrow range of concepts. For example, a first database may be composed of illustration layout templates defined according to the association types between words, and a second database may be composed of illustration layout templates defined according to the association types between phrases/paragraphs. Further, the illustrations matching the meaning of each word may be searched for, and the illustration matching with the highest probability may be selected. A template is loaded and prepared internally according to the word associations, and the illustration matched from each word is inserted into the template to generate a primary illustration. Further, according to the associations between the phrases/paragraphs, the generated primary illustration is inserted into a template loaded from the second database to generate a secondary illustration. The secondary illustration as described above may be defined as the basic form.
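A schematic sketch of this two-level template lookup follows. All template names, slot coordinates, association types, and illustration ids below are hypothetical placeholders for the database contents described above:

```python
# Hypothetical template databases keyed by association type; each
# template is a list of normalized (x, y) layout slots.
WORD_TEMPLATES = {                 # first database: word-word associations
    "subject-predicate": [(0.2, 0.5), (0.6, 0.5)],
    "predicate-object":  [(0.3, 0.4), (0.7, 0.6)],
}
PHRASE_TEMPLATES = {               # second database: phrase/paragraph relations
    "cause-effect":      [(0.25, 0.5), (0.75, 0.5)],
    "background-result": [(0.5, 0.35), (0.5, 0.7)],
}

def compose_primary(word_illustrations, relation):
    """Insert per-word illustrations into a word-level template's slots."""
    return list(zip(word_illustrations, WORD_TEMPLATES[relation]))

def compose_secondary(primary_illustrations, relation):
    """Arrange primary illustrations using a phrase-level template."""
    return list(zip(primary_illustrations, PHRASE_TEMPLATES[relation]))

primary = compose_primary(["launch.svg", "grow.svg"], "subject-predicate")
basic_form = compose_secondary([primary], "background-result")
```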
Further, using the basic form of the illustration and the design of the current presentation image, the graphical effect of the basic form of the illustration is automatically changed. For example, the colors used in the design of the current presentation image may be listed, color theme information of the current presentation image may be calculated by using the frequency, area, etc. of the colors as weights, and the basic form of the illustration may be colored using the colors in the calculated color theme. Alternatively, in addition to color information, the design of the current presentation image may be defined according to design elements such as the line styles, line thicknesses, curve frequencies, edge processing, and the like used in the design of the presentation image, and the graphic effect of the illustration may be changed using this information.
The user can post-edit the illustrations generated as described above. Additionally, when the design of the presentation image changes, the illustrations may be regenerated to match the changed design. Each template or main illustration the user selects from a search may be scored to perform reinforcement learning of the artificial intelligence model. By applying this reinforcement learning concept to the template or illustration search, results matching the preferences of users in general, or of an individual user, may be gradually learned and displayed.
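A minimal sketch of such selection scoring is given below. The tabular reward update is a hypothetical simplification of the reinforcement learning mentioned above; item names are illustrative.

```python
# A sketch of preference learning from user selections: a chosen template or
# illustration earns a positive reward, the unchosen ones a small penalty,
# and later searches rank candidates by the accumulated scores.
from collections import defaultdict

preference = defaultdict(float)

def record_choice(shown: list, chosen: str, reward: float = 1.0) -> None:
    for item in shown:
        preference[item] += reward if item == chosen else -0.1

def rank(candidates: list) -> list:
    return sorted(candidates, key=lambda c: preference[c], reverse=True)

record_choice(["template-a", "template-b", "template-c"], chosen="template-b")
print(rank(["template-a", "template-b", "template-c"]))  # template-b first
```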
Also, various embodiments of the present disclosure may be applied to messenger programs. Figs. 17 and 18A are diagrams for describing embodiments in which the illustration generating function according to the present disclosure is applied to a messenger program.
Referring to fig. 17, when a user selects an emoticon button provided in a messenger UI of a messenger program and inputs text for an emoticon to be generated, at least one emoticon associated with the input text may be generated and displayed. The user may select a desired emoticon among the generated emoticons and transmit the selected emoticon to the conversation partner. Further, the user can generate not only emoticons but also illustrations matching the text and send them to the other party.
Referring to fig. 18A, when a user selects a specific button (envelope shape) provided in a messenger UI of a messenger program, a background image matching input text may be generated. In addition, the text input to the text window may be inserted into the background image. The position of the text may be changed by a user operation such as touch and drag. As described above, a message in the form of an image in which text is inserted into a background image may be sent to a conversation partner.
Meanwhile, different embodiments of the present disclosure may also be applied to a keyboard program. Fig. 18B is a diagram for describing an embodiment of the present disclosure in which the illustration generation function is applied to a keyboard program.
Referring to fig. 18B, when the user selects an illustration generation button 1810 provided in the UI of the keyboard program and inputs text for an illustration to be generated, at least one illustration associated with the input text may be generated and displayed. The process of generating the illustrations may be performed by the methods described with reference to fig. 1-12. The keyboard program may operate with various other programs. For example, the keyboard program may operate in conjunction with a web browser program, a document creation program, a chat program, a messenger program, and so forth. That is, illustration information associated with text entered into a keyboard program may be obtained and transferred to a web browser program, a document creation program, a chat program, or a messenger program.
Fig. 19 is a block diagram for describing a configuration of the electronic apparatus 100 according to an embodiment of the present disclosure. The electronic device 100 is a device capable of performing all or some of the operations of the embodiments described above with reference to fig. 1-18A.
Referring to fig. 19, the electronic device 100 includes a memory 110 and a processor 120.
For example, the memory 110 may include an internal memory or an external memory. The internal memory may include, for example, at least one of a volatile memory (e.g., dynamic random access memory (DRAM), static RAM (SRAM), synchronous dynamic RAM (SDRAM), etc.), a non-volatile memory (e.g., one-time programmable read only memory (OTPROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), mask ROM, flash ROM, etc.), a flash memory (e.g., NAND flash memory, NOR flash memory, etc.), a hard disk drive, or a solid state drive (SSD).
The external memory may include a flash memory drive such as Compact Flash (CF), Secure Digital (SD), Micro secure digital (Micro-SD), Mini secure digital (Mini-SD), extreme digital (xD), multimedia card (MMC), memory stick, and the like. The external memory may be functionally and/or physically connected to the electronic device 100 through various interfaces.
The memory 110 is accessed by the processor 120, and reading, writing, correcting, deleting, updating, and the like of data in the memory 110 may be performed by the processor 120. In the present disclosure, the term "memory" includes the memory 110, a read only memory (ROM) or random access memory (RAM) in the processor 120, or a memory card (e.g., a micro secure digital (SD) card or a memory stick) mounted in the electronic device 100.
The memory 110 may store computer-executable instructions for performing a control method according to the embodiments described above with reference to fig. 1 to 18A.
The memory 110 may store presentation software, messenger software, and the like.
The memory 110 may store artificial intelligence models according to the embodiments described above with reference to fig. 1-18A. The artificial intelligence model can be learned by an external server and can be provided to the electronic device 100. The electronic device 100 may download the artificial intelligence model from an external server and store the artificial intelligence model in the memory 110, and when the artificial intelligence model is updated (or relearned), may receive and store the updated artificial intelligence model from the external server. The electronic device 100 may be connected to an external server through a Local Area Network (LAN), the internet, or the like.
The memory 110 may store various databases, such as a database composed of illustrations to which tag information is matched, a database composed of templates defining illustration layout forms according to the association of words in a sentence, a database composed of templates defining illustration layout forms according to the association between phrases/paragraphs of a sentence, and the like.
According to an embodiment, the memory 110 may also be implemented as an external server of the electronic device 100, such as a cloud server.
The processor 120 is a component for controlling the overall operation of the electronic device 100. The processor 120 may be implemented by, for example, a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), a system on a chip (SoC), a microcomputer (MICOM), or the like. The processor 120 may drive an Operating System (OS) or an application program to control a plurality of hardware or software components connected to the processor 120 and perform various data processing and calculations. According to an embodiment, the processor 120 may also include a Graphics Processing Unit (GPU) and/or an image signal processor.
The processor 120 executes computer-executable instructions stored in the memory 110 to enable the electronic device 100 to perform functions according to all or some of the embodiments described in fig. 1-18A.
The processor 120 may, by executing one or more instructions stored in the memory 110, acquire text based on user input, determine a plurality of key terms from the acquired text, acquire a plurality of first illustrations corresponding to the plurality of key terms, acquire a second illustration by synthesizing at least two or more of the plurality of first illustrations, and output the acquired second illustration.
Further, the processor 120 may provide a presentation image, acquire at least one illustration associated with text by inputting the text into an artificial intelligence model learned through an artificial intelligence algorithm when text for the presentation image is input, and provide the illustration selected by the user among the one or more acquired illustrations onto the presentation image.
According to an embodiment, the electronic device 100 may use a personal assistant program as an artificial intelligence specific program (or artificial intelligence agent) to obtain illustrations associated with text. In this case, the personal assistant program is a dedicated program for providing an artificial intelligence based service, and may be executed by the processor 120. The processor 120 may be a general purpose processor or a separate AI specific purpose processor.
According to an embodiment of the present disclosure, the electronic device 100 itself includes a display, and the processor 120 may control the display to display various images. According to another embodiment, the electronic apparatus 100 may be connected to an external display device and output an image signal to the external display device, thereby displaying various images on the external display device. In the latter case, the electronic apparatus 100 may be connected to the external display device by wire or wirelessly. For example, the electronic device 100 may include at least one of a component input jack, a High Definition Multimedia Interface (HDMI) input port, a USB port, or ports such as red, green, blue (RGB), Digital Visual Interface (DVI), HDMI, DisplayPort (DP), and Thunderbolt, and may be connected to an external display device through such ports. As another example, the electronic apparatus 100 may be connected to the external display device through a communication method such as wireless fidelity (WiFi), wireless display (WiDi), WirelessHD (WiHD), Wireless Home Digital Interface (WHDI), Wi-Fi Direct, Bluetooth (e.g., Bluetooth Classic), Bluetooth Low Energy, AirPlay, ZigBee, and the like.
The display included in the electronic apparatus 100 or the external display device connected to the electronic apparatus 100 may include, for example, a liquid crystal display (LCD), a light emitting diode (LED) display, an organic light emitting diode (OLED) display (e.g., an active-matrix organic light-emitting diode (AMOLED) or passive-matrix OLED (PMOLED) display), a micro-electro-mechanical systems (MEMS) display, an electronic paper display, or a touch screen.
In the present disclosure, the processor 120 "providing" an image, illustration, icon, or the like includes controlling the internal display of the electronic device 100 to display the image or illustration, or outputting an image signal for the image, illustration, or the like to an external display device connected to the electronic device 100.
According to an embodiment of the present disclosure, the electronic apparatus 100 may itself include an input device, and various user inputs may be received through the input device. The input device may include, for example, a touch pad, a touch screen, a button, a sensor capable of receiving motion input, a camera, or a microphone capable of receiving voice input.
According to another embodiment, the electronic apparatus 100 may be connected to an external input device and receive various user inputs through the external input device. For example, the external input device may include a keyboard, a mouse, a remote controller, and the like. The electronic apparatus 100 may be connected to the external input device in a wireless or wired manner. For example, the electronic apparatus 100 may be connected to an external input device by cable through a USB port or the like. As another example, the electronic apparatus 100 may be wirelessly connected to the external input device through a communication method such as infrared data association (IrDA), radio frequency identification (RFID), wireless fidelity (WiFi), Wi-Fi Direct, Bluetooth (e.g., Bluetooth Classic), Bluetooth Low Energy, ZigBee, and the like.
The electronic apparatus 100 may receive various user inputs such as text for generating an illustration and a user input for selecting an illustration through an input device included in the electronic apparatus 100 itself or an external input device.
According to an embodiment, the processor 120 may provide a screen provided with a text input window as shown in fig. 1, and when text is input into the text input window, the processor 120 may input the text into the artificial intelligence model to obtain at least one illustration associated with the text.
According to another embodiment, the processor 120 may provide a screen as shown in fig. 13, and when the illustration generation menu 1310 is selected, the processor 120 may provide a UI 1320 for searching for an illustration. Further, when text is entered into a text entry area 1321 provided in the UI 1320 to search for an illustration and a search button 1323 is selected, the processor 120 may input the text into the artificial intelligence model to provide search results 1325 including at least one illustration associated with the text. Further, processor 120 may provide an illustration selected from search results 1325 to presentation image 1330.
According to another embodiment, the processor 120 may provide a screen as shown in FIG. 14, and when text is entered into the script input window 1400 and the illustration generation button 1410 is selected, the processor 120 may enter the entered text into the artificial intelligence model to provide at least one illustration 1421 associated with the text.
According to another embodiment, the processor 120 may provide a screen as shown in FIG. 15, and upon receiving a user input specifying text and a user input selecting the illustration generation button 1510, the processor 120 may enter the specified text 1520 into the artificial intelligence model to provide at least one illustration 1531 associated with the text.
According to another embodiment, as shown in fig. 16, when a specific user operation is input for block-specified text, the processor 120 may provide a menu 1640. When an illustration generation item included in the menu 1640 is selected, the processor 120 may provide a UI for searching for an illustration in which the block-specified text is entered into the text input area 1610, and when a search 1620 is selected, the processor 120 may input the block-specified text to the artificial intelligence model to provide a search result 1630 including at least one illustration associated with the text. Further, processor 120 may provide an illustration selected from the search result 1630 to the presentation image 1330.
According to an embodiment of the present disclosure, processor 120 may input information regarding the design of the presentation image and the text to the artificial intelligence model to thereby obtain at least one illustration associated with the text and corresponding to the design of the presentation image.
For example, referring to fig. 12, the processor 120 may input text to the first artificial intelligence model 1210 to obtain at least one first illustration 1211, and may input information about the design of the presentation image and the at least one first illustration 1211 to the second artificial intelligence model 1220 to obtain at least one second illustration 1221, the second illustration 1221 modified such that the at least one first illustration 1211 corresponds to the design of the presentation image.
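A minimal sketch of this two-model flow is shown below; both model calls are hypothetical placeholders, and only the data flow between the first and second artificial intelligence models follows fig. 12.

```python
# A sketch of the fig. 12 pipeline: the first model maps text to candidate
# first illustrations; the second model modifies them to correspond to the
# design of the presentation image. The bodies are toy stand-ins.
def first_model(text: str) -> list:
    """Stand-in for the first artificial intelligence model."""
    return [f"illustration-for-{word}" for word in text.split()[:2]]

def second_model(illustrations: list, design: dict) -> list:
    """Stand-in for the second model, restyling to the presentation design."""
    return [f"{item}@{design['theme']}" for item in illustrations]

design_info = {"theme": "blue-minimal"}       # hypothetical design summary
first = first_model("startup challenge")      # at least one first illustration
second = second_model(first, design_info)     # modified second illustrations
print(second)
```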
According to an embodiment, the processor 120 may input text to the artificial intelligence model to obtain a plurality of illustrations associated with the text and having the same graphical effect as each other.
According to an embodiment, processor 120 may input text into the artificial intelligence model to obtain, as the illustrations associated with the text, a plurality of first illustrations and a second illustration in which the plurality of first illustrations are synthesized. For example, the processor 120 may obtain an illustration in which a plurality of illustrations are synthesized using the artificial intelligence model described with reference to fig. 5-11.
According to an embodiment of the present disclosure, the memory 110 may store a database including illustrations matched with the tag information described in fig. 4. In this case, the processor 120 may input text to the artificial intelligence model to acquire at least one key term from the text, and may search a database stored in the memory 110 for an illustration corresponding to the at least one acquired key term. The database may also be stored in a server external to the electronic device 100.
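A sketch of such a tag-matched lookup follows; the database rows, tags, and overlap scoring are hypothetical stand-ins for the stored database and the matching probability described earlier.

```python
# A sketch of searching a tag-matched illustration database with key terms.
ILLUSTRATION_DB = [
    {"file": "rocket.svg", "tags": {"launch", "startup", "rocket"}},
    {"file": "bar-chart.svg", "tags": {"grow", "increase", "chart"}},
]

def search(key_terms: set):
    """Return the illustration whose tags overlap the key terms the most."""
    best = max(ILLUSTRATION_DB, key=lambda row: len(row["tags"] & key_terms))
    return best["file"] if best["tags"] & key_terms else None

print(search({"launch", "grow"}))  # -> rocket.svg (one-term overlap each;
                                   # ties resolve to the first row in this DB)
```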
According to an embodiment of the present disclosure, processor 120 may relearn the artificial intelligence model by applying feedback data that includes information about an illustration selected by the user among one or more illustrations obtained using the artificial intelligence model.
According to another embodiment of the present disclosure, for example, the processor 120 may input text input to a UI provided by executing a messenger program to the artificial intelligence model to provide emoticons associated with the text as described in fig. 17 and provide a background image as described in fig. 18A.
Meanwhile, although it is described in the above embodiment that only one electronic device 100 is used, the above embodiment may be implemented using several devices. This will be described with reference to fig. 20A, 20B, and 20C.
Fig. 20A is a flow diagram of a network system using an artificial intelligence model according to various embodiments of the disclosure.
Referring to fig. 20A, a network system using an artificial intelligence system may include a first component 2010a and a second component 2020a. For example, the first component 2010a may be an electronic device such as a desktop computer, smart phone, tablet PC, etc., and the second component 2020a may be a server storing an artificial intelligence model, a database, etc. Alternatively, the first component 2010a may be a general-purpose processor and the second component 2020a may be an artificial intelligence dedicated processor. Alternatively, the first component 2010a may be at least one application, and the second component 2020a may be an Operating System (OS). That is, the second component 2020a is a component that is more integrated, more specialized, has less latency, has superior performance, or has more resources than the first component 2010a, and may be a component capable of handling many of the computations required in generating, updating, or applying a model faster and more effectively than the first component 2010a.
An interface for transmitting/receiving data between the first component 2010a and the second component 2020a may be defined.
As an example, an Application Program Interface (API) having learning data to be applied to the model as a parameter (or intermediate or transfer) value may be defined. The API may be defined as a set of subroutines or functions that can be called in any one protocol (e.g., a protocol defined in the first component 2010a) for any processing of another protocol (e.g., a protocol defined in the second component 2020a). That is, an environment may be provided in which an operation of one protocol can be performed in another protocol through the API.
Referring to fig. 20A, first, in operation S2001a, text may be input to the first component 2010a. Text may be entered into the first component 2010a through various input devices such as a keyboard, touch screen, etc. Alternatively, a voice may be input to the first component 2010a and converted into text. Here, the text may be a script for a presentation image or text input to a text input window of a messenger program.
In addition, the first component 2010a may transmit the input text to the second component 2020a in operation S2003 a. For example, the first component 2010a may be connected to the second component 2020a through a Local Area Network (LAN) or the internet, or may be connected to the second component 2020a through a wireless communication (e.g., wireless communication such as GSM, UMTS, LTE, WiBRO, etc.) method.
The first component 2010a may transmit the input text to the second component 2020a as it is, or may perform natural language processing on the input text and transmit it to the second component 2020 a. In this case, the first component 2010a may store an artificial intelligence model for performing natural language processing.
In operation S2005a, the second component 2020a may enter the received text into an artificial intelligence model to obtain at least one illustration associated with the text. The second component 2020a may store a database including various data required for generating artificial intelligence models and illustrations. The second component 2020a may perform operations using an artificial intelligence model according to the different embodiments described above.
Further, in operation S2007a, the second component 2020a may send the at least one acquired illustration to the first component 2010a. In this case, for example, the second component 2020a may send the at least one acquired illustration to the first component 2010a in the form of an image file. As another example, the second component 2020a may transmit information on a storage address (e.g., a URL address) of the at least one acquired illustration to the first component 2010a.
In operation S2009a, first component 2010a may provide the illustration received from second component 2020 a. For example, the first component 2010a may display at least one received illustration through a display or external display device included in the first component 2010a itself. The user may select and use the illustration desired to be used from one or more displayed illustrations. For example, the artwork may be used to create a presentation image and may be used as an emoticon, background, etc. to be sent to a conversation partner in a messenger program.
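From the first component's side, the exchange in operations S2003a through S2009a could look like the sketch below. The endpoint URL and JSON shape are hypothetical assumptions; the disclosure does not specify a wire format.

```python
# A sketch of the fig. 20A exchange: send the input text to the second
# component and receive storage addresses (URLs) of the generated
# illustrations back for display. Endpoint and payload are hypothetical.
import json
import urllib.request

def request_illustrations(text: str,
                          server: str = "https://example.com/illustrations"):
    payload = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(
        server, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:   # S2003a -> S2007a
        return json.loads(resp.read())["urls"]  # illustration URL addresses

# urls = request_illustrations("the growing challenge of starting a business")
# The first component would then display them (S2009a) for the user to pick.
```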
The artificial intelligence model as described above may be a determination model learned based on an artificial intelligence algorithm, for example, a model based on a neural network. The learned artificial intelligence model may be designed to mimic the structure of the human brain on a computer and may include a plurality of network nodes, having weights, that mimic the neurons of a human neural network. The plurality of network nodes may each form a connection relationship to mimic the synaptic activity of neurons exchanging signals through synapses. Further, the learned artificial intelligence model may include, for example, a neural network model or a deep learning model developed from a neural network model. In a deep learning model, a plurality of network nodes may be located at different depths (or layers) and may exchange data according to a convolution connection relationship. Examples of the learned artificial intelligence model may include, but are not limited to, a Deep Neural Network (DNN), a Recurrent Neural Network (RNN), a Bidirectional Recurrent Deep Neural Network (BRDNN), and the like.
According to an embodiment, the first component 2010a may use a personal assistant program that is an artificial intelligence specific program (or artificial intelligence agent) to obtain the illustrations associated with the text described above. In this case, the personal assistant program is a dedicated program for providing an artificial intelligence based service, and may be executed by an existing general-purpose processor or a separate AI-dedicated processor.
Specifically, the artificial intelligence agent may be operated (or executed) when a predetermined user input is input (e.g., an icon touch corresponding to the personal assistant chat robot, a user voice including a predetermined word, etc.) or a button included in the first component 2010a is pressed (e.g., a button for executing the artificial intelligence agent). In addition, the artificial intelligence agent can send text to the second component 2020a and can provide at least one illustration received from the second component 2020 a.
Of course, the artificial intelligence agent may also be operated when a predetermined user input is detected on the screen or a button included in the first component 2010a (e.g., a button for executing the artificial intelligence agent) is pressed. Alternatively, the artificial intelligence agent may be in a pre-executed state before a predetermined user input is detected or a button included in the first component 2010a is selected. In this case, after a predetermined user input is detected or a button included in the first component 2010a is selected, the artificial intelligence agent of the first component 2010a may acquire illustrations based on text. Further, the artificial intelligence agent may be in a standby state before a predetermined user input is detected or a button included in the first component 2010a is selected. Here, the standby state is a state of monitoring for the reception of a predetermined user input in order to control the start of the operation of the artificial intelligence agent. When a predetermined user input is detected or a button included in the first component 2010a is selected while the artificial intelligence agent is in the standby state, the first component 2010a may operate the artificial intelligence agent and provide an illustration obtained based on text.
As another embodiment of the present disclosure, when the first component 2010a directly acquires at least one illustration associated with a text using the artificial intelligence model, the artificial intelligence agent may control the artificial intelligence model to acquire the at least one illustration associated with the text. In this case, the artificial intelligence agent may perform the operations of the second component 2020a described above.
FIG. 20B is a flow diagram of a network system using artificial intelligence models in accordance with an embodiment of the present disclosure.
Referring to fig. 20B, a network system using an artificial intelligence system may include a first component 2010b, a second component 2020b, and a third component 2030b. For example, the first component 2010b may be an electronic device such as a desktop computer, smart phone, tablet PC, etc., the second component 2020b may be a server running presentation software such as Microsoft PowerPoint™, Keynote™, etc., and the third component 2030b may be a server storing an artificial intelligence model or the like that performs natural language processing.
An interface for transmitting/receiving data between the first, second and third components 2010b, 2020b and 2030b may be defined.
Referring to fig. 20B, first, in operation S2001B, text may be input to the first component 2010B. The text may be entered into the first component 2010b through various input devices such as a keyboard, a touch screen, and the like. Alternatively, a voice may be input to the first component 2010b and the voice converted into text.
Thereafter, the first component 2010b may transmit the input text to the third component 2030b in operation S2003 b. For example, the first component 2010b may be connected to the third component 2030b through a Local Area Network (LAN) or the internet, or may be connected to the third component 2030b through a wireless communication (e.g., wireless communication such as GSM, UMTS, LTE, WiBRO, etc.) method.
In operation S2005b, the third component 2030b may input the received text to an artificial intelligence model to obtain at least one key term associated with the text and an association between the key terms. In operation S2007b, the third component 2030b may send the key terms and the association between the key terms to the second component 2020 b.
In operation S2009b, the second component 2020b may generate a composite illustration using the received key terms and the associations between the key terms. In operation S2011b, the second component 2020b may transmit the generated composite illustration to the first component 2010 b. For example, the first component 2010b may display at least one received illustration through a display or external display device included in the first component 2010b itself. For example, the artwork may be used to create a presentation image and may be used as an emoticon, background, etc. to be sent to a conversation partner in a messenger program.
Fig. 20C is a configuration diagram of a network system according to an embodiment of the present disclosure.
Referring to fig. 20C, a network system using an artificial intelligence model may include a first component 2010C and a second component 2020C. For example, the first component 2010c may be an electronic device such as a desktop computer, a smart phone, a tablet PC, etc., and the second component 2020c may be a server storing an artificial intelligence model, a database, etc.
The first component 2010c may include an inputter 2012c and an outputter 2014c. The inputter 2012c may receive text through an input device. The input device may include, for example, a keyboard, touchpad, mouse, buttons, and the like. The input device may be embedded in the first component 2010c or may be an external input device connected to the first component 2010c. The outputter 2014c may output an image through an output device. For example, the outputter 2014c may output the illustration based on information received from the second component 2020c via an output device. The output device may include, for example, a liquid crystal display (LCD), a light emitting diode (LED) display, an organic light emitting diode (OLED) display, a micro-electro-mechanical systems (MEMS) display, an electronic paper display, or a touch screen. The output device may be embedded in the first component 2010c or may be an external output device connected to the first component 2010c.
The second component 2020c may include a natural language processor 2022c, a database 2026c, and an illustration generator 2024 c.
When the received text is input, the natural language processor 2022c may extract key terms from the text using an artificial intelligence model and detect the associations and context between the key terms.
Database 2026c may store illustrations matched with tag information. For example, the database may be searched for an illustration that matches tag information including a key term output from the natural language processor 2022c.
The illustration generator 2024c may generate a composite illustration by combining multiple illustrations searched from the database 2026c based on the received key terms and associations between key terms.
Although the natural language processor 2022c and the illustration generator 2024c according to an embodiment of the present disclosure are illustrated as being included in one server, this is only one example. For example, the natural language processor 2022c and the illustration generator 2024c may also be included in a separate server, and may also be included in the first component 2010 c.
Fig. 21 is a block diagram illustrating a configuration of an electronic device for learning and using artificial intelligence models according to an embodiment of the present disclosure.
Referring to fig. 21, the electronic device 2100 includes a learner 2110 and a determiner 2120. The electronic device 2100 of fig. 21 may correspond to the electronic device 100 of fig. 19 and the second component 2020a of fig. 20A.
The learner 2110 may generate and learn an artificial intelligence model having criteria for acquiring at least one image (illustration, emoticon, etc.) associated with text, using learning data. The learner 2110 may generate an artificial intelligence model having determination criteria using the collected learning data.
As an example, the learner 2110 may generate, learn, or relearn an artificial intelligence model to acquire an image associated with text by using the text and the image as learning data. Further, the learner 2110 may generate, learn, or relearn artificial intelligence models for modifying images to correspond to the design of the presentation by using information about the images and the design of the presentation as learning data.
The determiner 2120 may acquire an image associated with the text by using predetermined data as input data of the learned artificial intelligence model.
As an example, the determiner 2120 may obtain an image associated with the text by using the text as input data of the learned artificial intelligence model. As another example, the determiner 2120 may modify the image to correspond to the design of the presentation by using information about the image and the design of the presentation as input data for an artificial intelligence model.
At least a portion of the learner 2110 and at least a portion of the determiner 2120 may be implemented as software modules or manufactured in the form of at least one hardware chip and may be installed in the electronic device 100 or the second component 2020a. For example, at least one of the learner 2110 or the determiner 2120 may be manufactured in the form of a dedicated hardware chip for Artificial Intelligence (AI), or may be manufactured as part of an existing general-purpose processor (e.g., a CPU or application processor) or a graphics dedicated processor (e.g., a GPU) and installed in various electronic devices. In this case, the dedicated hardware chip for artificial intelligence is a dedicated processor specialized in probability calculation, and has higher parallel processing performance than general-purpose processors of the related art, so it can quickly process calculation operations in the field of artificial intelligence, such as machine learning. When the learner 2110 and the determiner 2120 are implemented as software modules (or program modules including instructions), the software modules may be stored in a non-transitory computer-readable medium. In this case, the software modules may be provided by an Operating System (OS) or by a predetermined application. Alternatively, some of the software modules may be provided by an Operating System (OS), and the remaining software modules may be provided by a predetermined application.
In this case, the learner 2110 and the determiner 2120 may be installed in one electronic device, or may be installed in separate electronic devices, respectively. Further, the model information constructed by the learner 2110 may be provided to the determiner 2120 through a wired or wireless line, and the data input to the determiner 2120 may be provided to the learner 2110 as additional learning data, as in the sketch below.
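The sketch shows the learner/determiner split under these assumptions: the learner builds a model from (text, image) learning data, the determiner applies it, and determiner inputs flow back as additional learning data. All class and method names are hypothetical.

```python
# A minimal sketch of the learner/determiner arrangement described above.
class Learner:
    def __init__(self):
        self.examples = []                     # (text, image) learning data

    def learn(self, text: str, image: str) -> None:
        self.examples.append((text, image))

    def build_model(self):
        lookup = dict(self.examples)           # toy stand-in for training
        return lambda text: lookup.get(text, "no-match.svg")

class Determiner:
    def __init__(self, learner: Learner):
        self.learner = learner

    def determine(self, text: str) -> str:
        result = self.learner.build_model()(text)
        self.learner.learn(text, result)       # input fed back as learning data
        return result

learner = Learner()
learner.learn("rocket launch", "rocket.svg")
print(Determiner(learner).determine("rocket launch"))  # -> rocket.svg
```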
Figs. 22 and 23 are block diagrams of the learner 2110 and the determiner 2120 according to various embodiments of the present disclosure.
Referring to fig. 22, the learner 2110 according to some embodiments may include a learning data acquirer 2110-1 and a model learner 2110-4. In addition, the learner 2110 may optionally further include at least one of a learning data preprocessor 2110-2, a learning data selector 2110-3, or a model evaluator 2110-5.
The learning data acquirer 2110-1 may acquire the learning data required for an artificial intelligence model for acquiring at least one image associated with text. As an embodiment of the present disclosure, the learning data acquirer 2110-1 may acquire information on text, images, the design of a presentation, and the like as learning data. The learning data may be data collected or tested by the learner 2110 or the manufacturer of the learner 2110.
The model learner 2110-4 may learn the artificial intelligence model so that it has criteria for acquiring images associated with text using the learning data. As an example, the model learner 2110-4 may learn the artificial intelligence model through supervised learning that uses at least a portion of the learning data as the criteria for acquiring images associated with text. Alternatively, the model learner 2110-4 may learn the artificial intelligence model through unsupervised learning, finding criteria for acquiring images associated with text by self-learning from the learning data without any supervision. For example, the model learner 2110-4 may learn the artificial intelligence model using generative adversarial network (GAN) techniques. Further, for example, the model learner 2110-4 may learn the artificial intelligence model through reinforcement learning that uses feedback on whether a determination result according to the learning is correct. In addition, the model learner 2110-4 may learn the artificial intelligence model using a learning algorithm including, for example, error back-propagation or gradient descent.
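As a concrete illustration of the GAN option, the sketch below implements one adversarial training step with PyTorch. It is a minimal sketch under strong simplifications: a real text-to-illustration GAN would condition the generator on text embeddings, whereas here a plain noise-to-image generator and flattened dummy images keep the example short.

```python
# One generative adversarial network (GAN) training step: the discriminator
# learns to separate real images from generated ones, and the generator
# learns to fool it. Network sizes and data are toy placeholders.
import torch
from torch import nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_images: torch.Tensor):
    batch = real_images.size(0)
    fake = G(torch.randn(batch, 64))
    # Discriminator: real images labeled 1, generated images labeled 0.
    loss_d = (bce(D(real_images), torch.ones(batch, 1))
              + bce(D(fake.detach()), torch.zeros(batch, 1)))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()
    # Generator: push the discriminator to label fakes as real.
    loss_g = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()

print(train_step(torch.randn(8, 784)))  # dummy batch of flattened images
```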
In addition, the model learner 2110-4 may use the input data to learn a selection reference as to which learning data should be used to obtain an image associated with the text.
When there are a plurality of artificial intelligence models constructed in advance, the model learner 2110-4 may determine an artificial intelligence model having a high relevance between the input learning data and the basic learning data as the artificial intelligence model to be learned. In this case, the basic learning data may be pre-classified for each type of data, and an artificial intelligence model may be constructed in advance for each type of data. For example, the basic learning data may be pre-classified by various criteria such as the area where the learning data was generated, the time at which the learning data was generated, the size of the learning data, the genre of the learning data, the generator of the learning data, the types of objects in the learning data, and the like.
When an artificial intelligence model is learned, the model learner 2110-4 may store the learned artificial intelligence model. For example, the model learner 2110-4 may store the learned artificial intelligence model in the memory 110 of the electronic device 100 or in the memory of the second component 2020a.
An artificial intelligence model learned from a set of texts and images learns the features, in image form, of the content represented by the text.
An artificial intelligence model learned from information about the design of presentation images and a set of images learns which features an image has with respect to the design of a presentation image.
The learner 2110 may also include a learning data pre-processor 2110-2 and a learning data selector 2110-3 to improve the determination of the artificial intelligence model or to save resources or time required to generate the artificial intelligence model.
The learning data preprocessor 2110-2 may preprocess the acquired data so that the acquired data can be used in learning to acquire images associated with text. The learning data preprocessor 2110-2 may process the acquired data into a predetermined format so that the model learner 2110-4 may use the acquired data to acquire images associated with text. For example, the learning data preprocessor 2110-2 may remove, from the input text, portions that are not needed when the artificial intelligence model provides a response (e.g., adverbs, exclamations, etc.).
The learning data selector 2110-3 may select data required for learning from data acquired by the learning data acquirer 2110-1 or data preprocessed by the learning data preprocessor 2110-2. The selected learning data may be provided to the model learner 2110-4. The learning data selector 2110-3 may select learning data required for learning among the acquired or preprocessed data according to a predetermined selection reference. Further, the learning data selector 2110-3 may also select learning data according to a predetermined selection reference through learning by the model learner 2110-4.
The learner 2110 may also include a model evaluator 2110-5 to improve the determination of the artificial intelligence model.
The model evaluator 2110-5 may input evaluation data to the artificial intelligence model, and when a determination result output from the evaluation data does not satisfy a predetermined reference, the model evaluator 2110-5 may cause the model learner 2110-4 to learn again. In this case, the evaluation data may be predefined data for evaluating the artificial intelligence model.
For example, the model evaluator 2110-5 may evaluate that the predetermined criterion is not satisfied when, among the determination results of the learned artificial intelligence model for the evaluation data, the number or ratio of evaluation data whose determination results are incorrect exceeds a predetermined threshold.
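This evaluation rule can be sketched as follows; the threshold, the model, and the evaluation set are hypothetical.

```python
# A sketch of the evaluator's rule: if the ratio of incorrect determination
# results on the evaluation data exceeds a threshold, request relearning.
def needs_relearning(model, evaluation_data, threshold: float = 0.2) -> bool:
    wrong = sum(1 for x, expected in evaluation_data if model(x) != expected)
    return wrong / len(evaluation_data) > threshold

toy_model = lambda x: x >= 0                   # stand-in "learned" model
eval_set = [(1, True), (-1, False), (2, True), (-3, True)]
print(needs_relearning(toy_model, eval_set))   # 1/4 wrong -> 0.25 > 0.2 -> True
```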
Meanwhile, when there are a plurality of learned artificial intelligence models, the model evaluator 2110-5 may evaluate whether each of the learned artificial intelligence models satisfies a predetermined reference, and determine a model satisfying the predetermined reference as a final artificial intelligence model. In this case, when there are a plurality of models satisfying the predetermined reference, the model evaluator 2110-5 may determine any one of the models or a predetermined number of models previously set in descending order of evaluation scores as the final artificial intelligence model.
Referring to fig. 23, the determiner 2120 according to some embodiments may include an input data acquirer 2120-1 and a determination result provider 2120-4.
Furthermore, the determiner 2120 may optionally further include at least one of an input data preprocessor 2120-2, an input data selector 2120-3, or a model updater 2120-5.
The input data acquirer 2120-1 may acquire data required to acquire at least one image associated with text. The determination result provider 2120-4 may acquire at least one image associated with text by applying the input data acquired by the input data acquirer 2120-1 to the learned artificial intelligence model as an input value. The determination result provider 2120-4 may obtain a determination result by applying data selected by the input data preprocessor 2120-2 or an input data selector 2120-3 to be described later as an input value to the artificial intelligence model.
As an embodiment, the determination result provider 2120-4 may acquire at least one image associated with text by applying the text acquired by the input data acquirer 2120-1 to the learned artificial intelligence model.
The determiner 2120 may further include an input data pre-processor 2120-2 and an input data selector 2120-3 to improve the determination result of the artificial intelligence model or save resources or time for providing the determination result.
The input data preprocessor 2120-2 can preprocess the acquired data so that the acquired data can be used to acquire at least one image associated with text. The input data preprocessor 2120-2 may process the acquired data into a predetermined format so that the determination result provider 2120-4 may acquire at least one image associated with text using the acquired data.
The input data selector 2120-3 may select the data required for providing a response from the data acquired by the input data acquirer 2120-1 or the data preprocessed by the input data preprocessor 2120-2. The selected data may be provided to the determination result provider 2120-4. The input data selector 2120-3 may select some or all of the acquired or preprocessed data according to a predetermined selection criterion for providing a response. The input data selector 2120-3 may also select data according to a selection criterion predetermined through learning by the model learner 2110-4.
The model updater 2120-5 may control the artificial intelligence model to be updated based on an evaluation of the determination result provided by the determination result provider 2120-4. For example, the model updater 2120-5 may request the model learner 2110-4 to additionally learn or update the artificial intelligence model by providing the determination result provided by the determination result provider 2120-4 to the model learner 2110-4. In particular, the model updater 2120-5 may relearn the artificial intelligence model based on feedback information according to a user input.
According to the various embodiments described above, when creating a document such as presentation material, a newspaper, a book, or the like, an image, illustration, or the like matching the content of an input script or sentence can be generated immediately, so the workload of the user in separately searching for images, illustrations, and the like can be reduced. Furthermore, since the artificial intelligence model can acquire illustrations that are similar in design, unified material can be created.
Meanwhile, it is apparent to those skilled in the art that the presentation material creation method described in the above embodiments may be applied to any field requiring an image matched with text, such as books, magazines, newspapers, advertisements, web page production, and the like.
The various embodiments described above may be implemented in software, hardware, or a combination thereof. According to a hardware implementation, the embodiments described in the present disclosure may be implemented using at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a processor, a controller, a microcontroller, a microprocessor, or an electronic unit for performing other functions. According to a software implementation, embodiments such as the processes and functions described in this disclosure may be implemented by separate software modules. Each of the software modules may perform one or more functions and operations described in the specification.
Various embodiments of the disclosure may be implemented in software including instructions that may be stored in a machine-readable storage medium readable by a machine (e.g., a computer). A machine is a device that invokes stored instructions from a storage medium and is operable according to the invoked instructions, and may include an electronic device (e.g., electronic device 100) according to the disclosed embodiments. When the instructions are executed by a processor, the processor may perform the functions corresponding to the instructions, directly or using other components under the control of the processor. The instructions may include code generated or executed by a compiler or interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" means that the storage medium does not include a signal and is tangible, but does not distinguish whether data is stored in the storage medium semi-permanently or temporarily.
According to an embodiment, methods according to various embodiments disclosed in the present disclosure may be included and provided in a computer program product. The computer program product may be used as a product for conducting transactions between sellers and buyers. The computer program product may be distributed in the form of a storage medium (e.g., a compact disc-read only memory (CD-ROM)) that is readable by a machine or online through an application store (e.g., playstore). In case of online distribution, at least a part of the computer program product may be at least temporarily stored in a storage medium, such as a memory of a server of a manufacturer, a server of an application store or a relay server, or temporarily generated.
Each of the components (e.g., modules or programs) according to various embodiments may include a single entity or multiple entities, and some of the above-described sub-components may be omitted, or other sub-components may be further included in various embodiments. Alternatively or additionally, some components (e.g., modules or programs) may be integrated into one entity to perform the same or similar functions performed by the respective components prior to integration. Operations performed by modules, programs, or other components may, according to various embodiments, be executed sequentially, in parallel, iteratively, or heuristically, or at least some operations may be executed in a different order or omitted, or other operations may be added.
Although the embodiments of the present disclosure have been shown and described above, the present disclosure is not limited to the specific embodiments described above, but various modifications may be made by those skilled in the art to which the present disclosure pertains without departing from the gist of the present disclosure disclosed in the appended claims. Such modifications should also be understood to fall within the scope and spirit of the present disclosure.
While the disclosure has been shown and described with reference to various embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents.

Claims (15)

1. A method of controlling an electronic device, the method comprising:
obtaining a text based on a user input;
determining a plurality of key terms from the acquired text;
acquiring a plurality of first illustrations corresponding to the plurality of key terms;
obtaining a second illustration by synthesizing at least two or more of the plurality of first illustrations; and
outputting the acquired second illustration.
2. The method of claim 1, wherein obtaining the text based on user input comprises:
receiving a voice input of a user;
acquiring utterance information of a user by analyzing the received voice input; and
acquiring text corresponding to the acquired utterance information of the user.
3. The method of claim 1, wherein obtaining the text based on user input comprises:
providing a presentation image; and
acquiring text for the presentation image based on a user input.
4. The method of claim 3, wherein, in acquiring the plurality of first illustrations, a plurality of first illustrations associated with the text and corresponding to the design of the presentation image are acquired by inputting information about the design of the presentation image and the text to the learned first artificial intelligence model.
5. The method of claim 4, wherein, in obtaining the plurality of first illustrations, a plurality of first illustrations associated with the text and having a same graphic effect as each other are obtained by inputting the text to the first artificial intelligence model.
6. The method of claim 3, wherein, in acquiring the second illustration, a second illustration modified such that at least two or more first illustrations correspond to the design of the presentation image is acquired by inputting information about the design of the presentation image and the plurality of first illustrations into the learned second artificial intelligence model.
7. The method of claim 6, wherein, in obtaining the second illustration, the second illustration is obtained by arranging and synthesizing the plurality of first illustrations according to the contexts of the plurality of key terms.
8. The method of claim 4, further comprising:
the first artificial intelligence model is relearned by applying feedback data to the first artificial intelligence model, the feedback data including information about an illustration selected by the user among the acquired plurality of first illustrations.
9. The method of claim 4, wherein the first artificial intelligence model comprises an artificial intelligence model learned through a generative adversarial network (GAN).
10. The method of claim 1, wherein, in acquiring the plurality of first illustrations, the plurality of first illustrations corresponding to the plurality of key terms are searched for and acquired from a pre-stored database.
11. An electronic device, comprising:
a memory storing one or more instructions; and
at least one processor coupled to the memory,
wherein the at least one processor is configured to execute the one or more instructions to:
acquire text based on a user input,
determine a plurality of key terms from the acquired text,
acquire a plurality of first illustrations corresponding to the plurality of key terms,
acquire a second illustration by synthesizing at least two or more of the plurality of first illustrations, and
output the acquired second illustration.
12. The electronic device of claim 11, wherein the at least one processor is further configured to:
receive a voice input of a user,
acquire utterance information of the user by analyzing the received voice input, and
acquire text corresponding to the acquired utterance information of the user.
13. The electronic device of claim 11, wherein the at least one processor is further configured to:
provide a presentation image, and
acquire text for the presentation image based on a user input.
14. The electronic device of claim 13, wherein the at least one processor is further configured to obtain a plurality of first illustrations associated with the text and corresponding to the design of the presentation image by inputting information about the design of the presentation image and the text into a learned first artificial intelligence model.
15. The electronic device of claim 14, wherein the at least one processor is further configured to obtain a plurality of first illustrations associated with the text and having a same graphical effect as one another by inputting the text to the first artificial intelligence model.
CN201980018825.7A 2018-03-12 2019-03-12 Electronic device and control method thereof Pending CN111902812A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
KR20180028603 2018-03-12
KR10-2018-0028603 2018-03-12
KR1020190023901A KR20190118108A (en) 2018-03-12 2019-02-28 Electronic apparatus and controlling method thereof
KR10-2019-0023901 2019-02-28
PCT/KR2019/002853 WO2019177344A1 (en) 2018-03-12 2019-03-12 Electronic apparatus and controlling method thereof

Publications (1)

Publication Number Publication Date
CN111902812A true CN111902812A (en) 2020-11-06

Family

ID=68424433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980018825.7A Pending CN111902812A (en) 2018-03-12 2019-03-12 Electronic device and control method thereof

Country Status (3)

Country Link
EP (1) EP3698258A4 (en)
KR (1) KR20190118108A (en)
CN (1) CN111902812A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11545133B2 (en) 2020-10-12 2023-01-03 Google Llc On-device personalization of speech synthesis for training of speech model(s)
KR102284539B1 (en) * 2020-11-30 2021-08-02 주식회사 애자일소다 Machine learning based artificial intelligence model learning, development, deployment and operation system service method using the same
KR102287407B1 (en) 2020-12-18 2021-08-06 영남대학교 산학협력단 Learning apparatus and method for creating image and apparatus and method for image creation
KR102280028B1 (en) * 2021-01-26 2021-07-21 주식회사 미디어코어시스템즈 Method for managing contents based on chatbot using big-data and artificial intelligence and apparatus for the same

Also Published As

Publication number Publication date
KR20190118108A (en) 2019-10-17
EP3698258A1 (en) 2020-08-26
EP3698258A4 (en) 2020-11-11

Similar Documents

Publication Publication Date Title
US11574116B2 (en) Apparatus and method for providing summarized information using an artificial intelligence model
US10970900B2 (en) Electronic apparatus and controlling method thereof
CN111247536B (en) Electronic device for searching related image and control method thereof
US20190034814A1 (en) Deep multi-task representation learning
US11721333B2 (en) Electronic apparatus and control method thereof
US11954150B2 (en) Electronic device and method for controlling the electronic device thereof
US20160071024A1 (en) Dynamic hybrid models for multimodal analysis
CN111902812A (en) Electronic device and control method thereof
CN110692061A (en) Apparatus and method for providing summary information using artificial intelligence model
US11443749B2 (en) Electronic device and control method thereof
US20210263963A1 (en) Electronic device and control method therefor
US11115359B2 (en) Method and apparatus for importance filtering a plurality of messages
US11443116B2 (en) Electronic apparatus and control method thereof
EP3523932B1 (en) Method and apparatus for filtering a plurality of messages
KR102438132B1 (en) Electronic device and control method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination