TWI698835B - Image processing method and device and computer-readable storage medium - Google Patents
- Publication number: TWI698835B (TW108101009A)
- Authority: TW (Taiwan)
- Prior art keywords
- character string
- picture
- item
- string
- user
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/54—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
Abstract
The embodiments of this specification disclose a picture processing method and device. The method includes: after a user opens a picture, receiving the user's voice in response to a user operation; recognizing a first character string from the voice as an addition item; and adding the addition item to the picture.
Description
The embodiments of this specification relate to the field of image processing and, more specifically, to a picture processing method and device.
With the development of Internet technology, people increasingly publish pictures on social platforms or send pictures to friends. For example, a user may post a picture of an item in Moments to promote the item. In this case, some features of the item, such as size, material, details, and appearance, need to be annotated in the picture. As another example, a user may post a personal photo in Moments and may wish to annotate the picture with a mood or feeling. The current approach is to annotate information such as size, material, mood, and feelings manually using picture editing software. Therefore, a more effective picture processing method is needed so that pictures can be annotated and tagged conveniently and quickly.
The embodiments of this specification aim to provide a more effective picture processing solution that addresses the deficiencies of the prior art.
To achieve the above objective, one aspect of this specification provides a picture processing method, including: after a user opens a picture, receiving the user's voice in response to a user operation; recognizing a first character string from the voice as an addition item; and adding the addition item to the picture.
Another aspect of this specification provides a picture processing method, including: after a user opens a picture, receiving the user's voice in response to a user operation; recognizing a first character string from the voice; acquiring, according to a preset keyword string library, at least one second character string corresponding to the first character string and/or at least one graphic corresponding to the first character string as at least one addition item; and adding the at least one addition item to the picture.
In one embodiment, in the above picture processing method, acquiring at least one second character string corresponding to the first character string and/or at least one graphic corresponding to the first character string as at least one addition item includes: obtaining, from the first character string, a character string that matches a keyword string in the keyword string library as an addition item.
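The matching step in this embodiment can be illustrated with a minimal sketch. It assumes plain substring matching and orders the results by where they first appear; the specification does not prescribe a matching algorithm, and the function name, library entries, and utterance below are hypothetical.

```python
def match_keywords(first_string, keyword_library):
    """Return the library keywords found in the recognized string.

    Assumption: a keyword "matches" when it occurs as a substring.
    Results are ordered by their first position in the string.
    """
    hits = []
    for keyword in keyword_library:
        if not keyword:
            continue  # ignore empty entries, which would match everywhere
        position = first_string.find(keyword)
        if position != -1:
            hits.append((position, keyword))
    return [keyword for _, keyword in sorted(hits)]


# Hypothetical library entries and recognized string, for illustration only.
library = ["pure copper", "plastic", "glass", "red", "magenta"]
recognized = "the handle is pure copper and the body is red glass"
print(match_keywords(recognized, library))
```

Each returned keyword would then be added to the picture as its own addition item.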
In one embodiment, in the above picture processing method, acquiring at least one second character string corresponding to the first character string and/or at least one graphic corresponding to the first character string as at least one addition item includes: obtaining, from the first character string, a third character string that matches a keyword string in the keyword string library, wherein the third character string represents a unit of quantity and is preceded in the first character string by a numeric character string; and obtaining, as an addition item, a character string consisting of the numeric character string followed by the third character string.
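A minimal sketch of the "number followed by unit" rule, assuming a regular expression is an acceptable way to find a numeric string immediately before a unit keyword. The function name and the sample utterance are hypothetical; the specification does not say how the numeric string is detected.

```python
import re


def extract_quantities(first_string, unit_keywords):
    """Return "<number><unit>" strings for every unit keyword preceded by a number."""
    if not unit_keywords:
        return []
    # Try longer units first so that "cm" is preferred over "m".
    units = sorted(unit_keywords, key=len, reverse=True)
    pattern = re.compile(
        r"(\d+(?:\.\d+)?)\s*(" + "|".join(re.escape(u) for u in units) + r")"
    )
    return [number + unit for number, unit in pattern.findall(first_string)]


# Hypothetical recognized string: "height 30cm, width 35cm, price 120 yuan".
utterance = "高30cm,寬35cm,價格120元"
print(extract_quantities(utterance, ["cm", "m", "元"]))  # ['30cm', '35cm', '120元']
```

A unit keyword that appears without a preceding number is ignored, which matches the condition stated in the embodiment. A short unit such as "m" can also match the start of an unrelated word that follows a number, so a production matcher would likely need stricter boundaries.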
In one embodiment, in the above picture processing method, acquiring at least one second character string corresponding to the first character string and/or at least one graphic corresponding to the first character string as at least one addition item includes: obtaining, from the first character string, a fourth character string that matches a keyword string in the keyword string library as an addition item, wherein the fourth character string is preset to correspond to a specific graphic; and obtaining the specific graphic as an addition item.
In one embodiment, in the above picture processing method, acquiring at least one second character string corresponding to the first character string and/or at least one graphic corresponding to the first character string as at least one addition item includes: obtaining, from the first character string, a fifth character string that matches a keyword string in the keyword string library, wherein the fifth character string is preset to correspond to a specific graphic; and obtaining the specific graphic as an addition item.
In one embodiment, the above picture processing method further includes: after the user opens the picture, acquiring, according to a picture application scene selected by the user, at least one graphic preset to correspond to the scene as at least one addition item; and adding the at least one addition item acquired according to the scene to the picture.
In one embodiment, in the above picture processing method, the picture application scene is a merchandise marketing scene, and the at least one graphic preset to correspond to the merchandise marketing scene includes a ruler, a label, a frame, and an arrow.
In one embodiment, in the above picture processing method, the preset keyword string library is a keyword string library corresponding to the picture application scene selected by the user.
In one embodiment, in the above picture processing method, the scene is a merchandise marketing scene, and the keyword string library corresponding to the scene includes keyword strings relating to the following attributes: material, size, color, price, and appearance.
In one embodiment, the above picture processing method further includes: before or after receiving the user's voice, displaying on the screen a voice input content prompt corresponding to the picture application scene.
In one embodiment, the above picture processing method further includes: after the addition item is added to the picture, performing at least one of the following modifications according to a user gesture or input: changing the position of the addition item, changing the size of the addition item, editing the content of the addition item, and deleting the addition item.
In one embodiment, in the above picture processing method, the user opening the picture includes: the user opening the picture in the photo album of the user's terminal, the user opening the picture in a social APP, or the user opening the picture in an APP for executing the method.
Another aspect of this specification provides a picture processing device, including: a receiving unit configured to receive the user's voice in response to a user operation after the user opens a picture; a recognition unit configured to recognize a first character string from the voice as an addition item; and an adding unit configured to add the addition item to the picture.
Another aspect of this specification provides a picture processing device, including: a receiving unit configured to receive the user's voice in response to a user operation after the user opens a picture; a recognition unit configured to recognize a first character string from the voice; an acquisition unit configured to acquire, according to a preset keyword string library, at least one second character string corresponding to the first character string and/or at least one graphic corresponding to the first character string as at least one addition item; and an adding unit configured to add the at least one addition item to the picture.
Another aspect of this specification provides a computer-readable storage medium on which instruction code is stored, wherein the instruction code, when executed in a computer, causes the computer to execute the above picture processing method.
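The embodiments above that map a matched keyword string to a specific graphic differ only in whether the keyword itself also becomes an addition item. The following sketch makes that difference explicit. The `AdditionItem` type, the mapping, and the flag name are assumptions introduced for illustration, not structures defined in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdditionItem:
    kind: str   # "text" or "graphic"
    value: str  # the keyword string, or the name of the graphic


def items_for(first_string, keyword_to_graphic, keyword_is_item=True):
    """Collect addition items for keywords that are preset to a graphic.

    keyword_is_item=True  -> the keyword and its graphic are both added.
    keyword_is_item=False -> only the graphic is added.
    """
    items = []
    for keyword, graphic in keyword_to_graphic.items():
        if keyword not in first_string:
            continue
        if keyword_is_item:
            items.append(AdditionItem("text", keyword))
        graphic_item = AdditionItem("graphic", graphic)
        if graphic_item not in items:  # add each graphic only once
            items.append(graphic_item)
    return items


# Hypothetical preset mapping from keyword strings to graphics.
mapping = {"pure copper": "label", "cm": "ruler"}
print(items_for("the frame is pure copper", mapping))
print(items_for("the frame is pure copper", mapping, keyword_is_item=False))
```

Keeping the mapping separate from the matching logic means a scene-specific keyword string library can supply its own mapping without changing the function.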
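The embodiments of this specification are described below with reference to the drawings.
Fig. 1 schematically shows a system 100 according to an embodiment of this specification. As shown in Fig. 1, the system 100 includes a display unit 11, a voice receiving unit 12, a voice recognition unit 13, an acquisition unit 14, a keyword string library 15, and a picture editing unit 16. First, the user opens a picture through the display unit 11. After opening the picture, the user can trigger the voice receiving unit 12 through its interface, for example by long-pressing a microphone icon displayed on the screen, so that the voice receiving unit 12 starts receiving voice. After the user releases the interface of the voice receiving unit 12 (for example, releases the microphone icon), the voice receiving unit 12 sends the received voice to the voice recognition unit 13. The voice recognition unit 13 uses a voice recognition function to convert the received voice into a character string, which may include text, digits, letters, symbols, and the like. In one embodiment, the voice recognition unit 13 sends the recognized character string to the picture editing unit 16, and the picture editing unit 16 adds the character string to the picture. In another embodiment, the voice recognition unit 13 sends the recognized character string to the acquisition unit 14. The acquisition unit 14 calls the keyword string library 15 and matches the character string against the keyword strings in the library, thereby obtaining a keyword string contained in the character string, a corresponding combination of character strings, or a corresponding graphic as an addition item, and sends the addition item to the picture editing unit 16. The picture editing unit 16 then adds the addition item to the picture.
Fig. 2 shows a flowchart of a picture processing method according to an embodiment of this specification. The method includes: in step S21, after the user opens a picture, receiving the user's voice in response to a user operation; in step S22, recognizing a character string from the voice as an addition item; and in step S23, adding the addition item to the picture.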
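The data flow through system 100 can be sketched as a single function that either passes the whole recognized string on (the first embodiment) or filters it through a keyword string library (the second embodiment). This is only an illustrative sketch: the recognizer is a stand-in that decodes bytes, because the specification relies on an existing voice recognition function and does not name one, and all identifiers are hypothetical.

```python
from typing import Callable, List, Optional


def process_voice(
    audio: bytes,
    recognize: Callable[[bytes], str],
    keyword_library: Optional[List[str]] = None,
) -> List[str]:
    """Return the addition items to hand to the picture editing unit."""
    text = recognize(audio)  # role of the voice recognition unit
    if keyword_library is None:
        return [text]  # first embodiment: the whole string is the addition item
    # Second embodiment: the acquisition unit consults the keyword string library.
    return [keyword for keyword in keyword_library if keyword in text]


def fake_recognizer(audio: bytes) -> str:
    """Stand-in for a real speech recognizer (assumption for this sketch)."""
    return audio.decode("utf-8")


picture_items: List[str] = []  # stand-in for the items drawn on the picture
picture_items.extend(process_voice(b"brushed pure copper", fake_recognizer))
picture_items.extend(
    process_voice(b"brushed pure copper", fake_recognizer, ["pure copper", "glass"])
)
print(picture_items)
```

Passing the recognizer in as a parameter keeps the sketch independent of any particular speech recognition service.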
First, in step S21, after the user opens a picture, the user's voice is received in response to a user operation. The device on which the user opens the picture is not limited. For example, the user can open the picture on a portable smart device or on a computer. When the user opens the picture on, for example, a mobile phone, the specific place where the picture is opened is also not limited. For example, the user can open the picture in a phone photo album that has the picture processing function according to the embodiments of this specification, in a social APP (such as Moments or a life circle) that has the picture processing function, or in an APP for executing the picture processing method according to the embodiments of this specification.
After opening the picture, the user can perform an operation that opens the voice receiving interface. For example, when the user opens the picture on a computer, the user can turn on the microphone to start voice reception on the computer. When the user opens the picture on a mobile phone, the user can long-press the microphone icon on the screen to start voice reception on the phone. In one embodiment, the user can tap the microphone icon on the screen (the icon is located outside the picture) and then long-press a specific position in the picture to perform voice input. A tag obtained through voice recognition can thus be inserted at that specific position in the picture.
In step S22, a character string is recognized from the voice as an addition item. Voice recognition can be performed with an existing voice recognition function, which recognizes the corresponding character string from the input voice. The character string may include Chinese characters, digits, letters, symbols, and the like.
In step S23, the addition item is added to the picture. That is, the above character string is added to the picture as a text field. In one embodiment, the user long-presses the microphone icon on the screen to perform voice input, in which case the system adds the addition item at a random position in the picture. In another embodiment, the user taps the microphone icon and then long-presses a specific position in the picture to perform voice input, in which case the system adds the addition item at that specific position.
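The two placement behaviors in step S23 can be sketched as follows. The function name and the pixel coordinate convention are hypothetical, and clamping the pressed position to the picture bounds is an added assumption that the specification does not mention.

```python
import random


def choose_position(width, height, pressed_at=None, rng=random):
    """Pick where an addition item is placed in a width x height picture.

    pressed_at: (x, y) of the long-press inside the picture, or None when the
    user long-pressed the microphone icon instead.
    """
    if pressed_at is not None:
        x, y = pressed_at
        # Assumption: keep the position inside the picture bounds.
        return (min(max(x, 0), width - 1), min(max(y, 0), height - 1))
    # No position given: place the item at a random position in the picture.
    return (rng.randrange(width), rng.randrange(height))


print(choose_position(100, 50, pressed_at=(20, 30)))  # (20, 30)
print(choose_position(100, 50, rng=random.Random(0)))  # some point inside the picture
```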
In one embodiment, after the user opens the picture, at least one graphic preset to correspond to a picture application scene selected by the user is acquired as at least one addition item, and the at least one addition item acquired according to the scene is added to the picture. For example, when the picture processing is performed in a picture processing APP according to the embodiments of this specification, the APP may provide selection buttons for multiple scenes, such as a merchandise marketing scene, a selfie scene, a teaching scene, and a matchmaking scene. In the APP, the user can select the scene before opening the picture or after opening it. The APP presets corresponding graphics for some of the scenes. For example, for the merchandise marketing scene, the preset graphics include rulers, labels, pictures, and arrows. Thus, after the user opens the picture and selects the merchandise marketing scene, the APP automatically acquires the corresponding graphics, such as the ruler and the label, and automatically adds them to the picture. A person having ordinary knowledge in the technical field of the present invention will understand that opening the picture in the APP is only an example. The user can also, for example, open the picture in the phone photo album and select the picture application scene after the picture is opened.
In one embodiment, before or after the user's voice is received, a voice input content prompt corresponding to the picture application scene is displayed on the screen.
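One way to picture the scene-dependent behavior described above is a small per-scene configuration that holds the preset graphics and the voice input prompts. The sketch below is an assumption about how such presets could be stored. The `SceneConfig` type and the prompt texts are hypothetical, and only scenes named in the text are used.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SceneConfig:
    preset_graphics: List[str] = field(default_factory=list)
    voice_prompts: List[str] = field(default_factory=list)


# Hypothetical presets; the specification says only some scenes have graphics.
SCENES = {
    "merchandise marketing": SceneConfig(
        preset_graphics=["ruler", "label"],
        voice_prompts=["size", "material", "price"],
    ),
    "selfie": SceneConfig(voice_prompts=["mood"]),
}


def on_scene_selected(scene):
    """Return (graphics to add automatically, prompts to display on screen)."""
    config = SCENES.get(scene, SceneConfig())
    return list(config.preset_graphics), list(config.voice_prompts)


print(on_scene_selected("merchandise marketing"))
print(on_scene_selected("selfie"))
```

A scene without an entry falls back to an empty configuration, so no graphic is added and no prompt is shown.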
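After the addition item is added, the user can perform various operations on it. For example, when using a mobile phone, the user can change the position and size of the addition item through gestures: pressing the addition item and sliding on the screen moves it to a new position, rotating it with two fingers adjusts its angle, and sliding two fingers along its diagonal adjusts its size. In addition, the user can enter new characters in the addition item or delete existing characters. The user can also long-press the addition item to display more operation buttons, such as a delete button, and thereby perform further editing operations on the addition item.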
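The editing operations on an addition item (move, rotate, resize, edit text, delete) can be modelled with a small mutable record. This is a sketch under assumptions: the field names, the scale factor representation, and the mapping from gestures to method calls are all hypothetical, since the specification describes the gestures but not a data model.

```python
from dataclasses import dataclass


@dataclass
class AdditionItem:
    text: str
    x: float
    y: float
    scale: float = 1.0
    angle_degrees: float = 0.0

    def move_to(self, x, y):
        """Press-and-slide gesture: move the item to a new position."""
        self.x, self.y = x, y

    def rotate(self, degrees):
        """Two-finger rotation: adjust the angle, kept within [0, 360)."""
        self.angle_degrees = (self.angle_degrees + degrees) % 360

    def resize(self, factor):
        """Two-finger diagonal slide: scale the item by a positive factor."""
        if factor <= 0:
            raise ValueError("scale factor must be positive")
        self.scale *= factor

    def edit(self, new_text):
        """Replace the item's text after the user types or deletes characters."""
        self.text = new_text


items = [AdditionItem("30cm", x=10, y=20)]
items[0].move_to(40, 60)
items[0].rotate(90)
items[0].resize(2)
items[0].edit("35cm")
print(items[0])
items.remove(items[0])  # delete button: remove the item from the picture
print(items)  # []
```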
圖7顯示根據本說明書實施例的一種圖片處理裝置700,包括:接收單元71,配置為,在用戶打開圖片之後,回應於用戶操作,接收用戶的語音;識別單元72,配置為,從所述語音識別出第一字串,作為添加項;以及添加單元73,配置為,在所述圖片上添加所述添加項。
圖8顯示根據本說明書實施例的一種圖片處理裝置800,包括:接收單元81,配置為,在用戶打開圖片之後,回應於用戶操作,接收用戶的語音;識別單元82,配置為,從所述語音識別出第一字串;第一獲取單元83,配置為,根據預設的關鍵字串庫,獲取與所述第一字串對應的至少一個第二字串、和/或與所述第一字串對應的至少一個圖形,作為至少一個添加項;以及第一添加單元84,配置為,在所述圖片上分別添加所述至少一個添加項。
在一個實施例中,在上述圖片處理裝置800中,所述第一獲取單元還配置為,從所述第一字串中獲取與所述關鍵字串庫中的關鍵字串匹配的字串,作為添加項。
在一個實施例中,在上述圖片處理裝置800中,所述第一獲取單元還配置為,從所述第一字串中獲取與所述關鍵字串庫中的關鍵字串匹配的第三字串,其中,所述第三字串為表示量的單位的字串,並且在所述第一字串中,在所述第三字串之前為數字字串,以及,獲取順序包括所述數字字串和所述第三字串的字串作為添加項。
在一個實施例中,在上述圖片處理裝置800中,所述第一獲取單元還配置為,從所述第一字串中獲取與所述關鍵字串庫中的關鍵字串匹配的第四字串作為添加項,其中,所述第四字串預設為對應於特定圖形,以及,獲取所述特定圖形作為添加項。
在一個實施例中,在上述圖片處理裝置800中,所述第一獲取單元還配置為,從所述第一字串中獲取與所述關鍵字串庫中的關鍵字串匹配的第五字串,其中,所述第五字串預設為對應於特定圖形,以及,獲取所述特定圖形作為添加項。
在一個實施例中,上述圖片處理裝置800還包括:第二獲取單元85,配置為,在用戶打開圖片之後,根據用戶選擇的圖片應用場景,獲取預設為與所述場景對應的至少一個圖形作為至少一個添加項,以及第二添加單元86,配置為,在所述圖片上分別添加根據所述場景獲取的至少一個添加項
在一個實施例中,上述圖片處理裝置800還包括,提示單元87,配置為,在接收用戶選擇的圖片應用場景之後,在螢幕上顯示與所述場景對應的語音輸入內容提示。
在一個實施例中,上述圖片處理裝置800還包括修改單元88,配置為,在圖片中添加所述添加項之後,根據用戶手勢或輸入進行以下至少一種修改:改變所述添加項的位置、改變所述添加項的尺寸、編輯所述添加項的內容、以及刪除所述添加項。
本說明書實施例還提供一種電腦可讀的儲存媒體,其上儲存有指令碼,所述指令碼在電腦中執行時,令電腦執行如上所述的圖片處理方法。
在根據本說明書實施例的圖片處理方法和裝置中,透過以語音輸入的方式對圖片打標籤,降低了圖片處理難度,大大提高了圖片處理效率,滿足了用戶的需求。
本發明所屬技術領域中具有通常知識者應該還可以進一步意識到,結合本文中所揭示的實施例描述的各示例的單元及演算法步驟,能夠以電子硬體、電腦軟體或者二者的結合來實現,為了清楚地說明硬體和軟體的可互換性,在上述說明中已經按照功能一般性地描述了各示例的組成及步驟。這些功能究竟以硬體還是軟體方式來執軌道,取決於技術方案的特定應用和設計約束條件。本發明所屬技術領域中具有通常知識者可以對每個特定的應用來使用不同方法來實現所描述的功能,但是這種實現不應認為超出本發明的範圍。
結合本文中所揭示的實施例描述的方法或演算法的步驟可以用硬體、處理器執軌道的軟體模組,或者二者的結合來實施。軟體模組可以置於隨機記憶體(RAM)、記憶體、唯讀記憶體(ROM)、電可編程ROM、電可抹除可編程ROM、暫存器、硬碟、可移動磁碟、CD-ROM、或技術領域內所公知的任意其它形式的儲存媒體中。
以上所述的具體實施方式,對本發明的目的、技術方案和有益效果進行了進一步詳細說明,所應理解的是,以上所述僅為本發明的具體實施方式而已,並不用於限定本發明的保護範圍,凡在本發明的精神和原則之內,所做的任何修改、等同替換、改進等,均應包含在本發明的保護範圍之內。After adding the added item, the user can perform various operations on the added item. For example, in the case of a user using a mobile phone, the user can change the position of the added item and the size of the added item through gestures, for example, by pressing the added item and sliding on the screen to adjust the added item to the new Position, rotate the added item with two fingers, adjust the angle of the added item, and adjust the size of the added item by sliding two fingers in the diagonal direction of the added item. In addition, the user can enter a new character in the added item or delete an existing character, or the user can display more operation buttons, such as a delete button, by long-pressing the added item to make changes. Multiple editing operations for the added item.
Fig. 3 shows a flowchart of a picture processing method according to an embodiment of the specification. The method includes: in step S31, after the user opens the picture, receiving the user's voice in response to the user's operation; in step S32, recognizing a first character string from the voice; in step S33, according to a preset keyword A string library, acquiring at least one second string corresponding to the first string and/or at least one graphic corresponding to the first string as at least one additional item; and in step S34, in the The at least one addition item is added to the picture respectively.
Steps S31 and S32 in this method are basically the same as steps S21 and S22 in FIG. 2 and will not be repeated here.
In step S33, at least one second character string corresponding to the first character string and/or at least one graphic corresponding to the first character string is acquired according to a preset keyword string library, as at least one addition item.
In one embodiment, the second character string is the first character string.
The keyword string library can be obtained through manual sorting or machine learning. It can include keyword strings corresponding to each specific scenario. For example, a specific scenario is a merchandise marketing scenario. In this scenario, in order to promote the item in the picture, the user needs to tag various attributes of the item. For example, the attributes include material, size, color, price, appearance, and so on. Therefore, in the keyword string library corresponding to the merchandise marketing scenario, the keyword strings related to the above-mentioned attributes may be included. For example, in the material category, keywords such as "pure copper", "plastic", and "glass" can be included, and in the size category, it can include "cm", "m", "cm", etc. The keyword string of the size unit, in the color category, can include "red", "lotus color", "magenta" and other color keywords, and in the price category, it can include "yuan" and "dollar" ", etc., represents a keyword string of currency units, and, in the appearance category, keyword strings such as "brushed metal", "polished" and the like represent appearance.
For another example, the scene is a matchmaking scene. In this scene, in order to introduce a character in a picture, the user needs to label the character with various character attributes. For example, the attributes include age, major, work unit, and so on. Then, the keyword string library corresponding to the matchmaking scene may include keyword strings corresponding to the above attributes, such as age unit (years), physics, biology, automation, company, office, etc.
For another example, the scene is a selfie scene. In this scene, the user can tag the self-portrait with mood and feeling labels. Therefore, the keyword string library corresponding to the scene may include keyword strings such as "happy, angry, and anxious".
In one embodiment, the user can select a picture application scenario. For example, after the user opens the picture, a scene option button may be displayed on the screen, and the user may select a desired picture application scene through the button, or the user may pre-select the picture application scene before opening the picture. After the user selects the scene, the system obtains the added item according to the preset keyword string library corresponding to the scene. For example, Figure 4 shows an example of a merchandise marketing scenario. After opening the picture as shown in Figure 4, the user can select "Commodity Marketing Scenario". Therefore, after the system recognizes the user's voice input as a word string, it calls the keyword string library corresponding to the commodity marketing scene to match the word string.
In one embodiment, after receiving the picture application scene selected by the user, the system displays a voice input content prompt corresponding to the scene on the screen before or after receiving the user's voice. Figure 5 schematically shows the voice input prompts on the screen in the merchandise marketing scenario, including "length 120 inside" (size), "metal is brushed and polished pure copper" (material), "new spring", "50 yuan to take "Price" (price), etc. The voice input content prompt can be preset corresponding to a specific scene.
In one embodiment, for example, after the user selects the merchandise marketing scene as described above, the user enters the voice "30cm in height, 35cm in width" by long pressing the microphone on the screen, the hardware material is pure copper hardware matte, and the decoration is perforated by round head nails , The price is 120 yuan". After the system recognizes the speech as a character string, it matches the character string with the keyword string in the keyword string library corresponding to the commodity marketing scenario. The keyword string "pure copper hardware matte" is included in the material classification of the keyword string library, and the keyword string "round head nail punching" is included in the appearance classification. Therefore, the "pure copper hardware matte" and "Punch nail hole" as an addition item to be added to the picture. In one embodiment, the keyword strings related to material and appearance are preset in the keyword string library to correspond to the label graphics. Therefore, after obtaining the added items "pure copper hardware matte" and "round head nail punching", the system also automatically obtains the label graphics as the added items. The label graphic is used to mark the specific position corresponding to the material of "pure copper hardware matte" and the specific position corresponding to the appearance of "round head nail punching" in the picture.
In one embodiment, it can be obtained from the above-mentioned character string, "cm" that matches the keyword string "cm" in the size classification of the keyword string library, and it can be judged that in the above-mentioned character string, " "cm" is a numeric string before, so the "30cm" and "35cm" in the obtained string are added to the picture as additional items. In one embodiment, "cm" is set in the keyword string library to correspond to the ruler graphic, so that after acquiring the addition items "30cm" and "35cm", the system also automatically obtains the ruler graphic as the addition item.
In one embodiment, the keyword string "yuan" is included in the price classification of the keyword string library, so that the keyword string "yuan" can be obtained from the above-mentioned string. And it can be judged that in the above-mentioned character string, "yuan" is preceded by a numeric character string, so the "120 Yuan" in the above-mentioned character string is obtained as an additional item and added to the picture.
In one embodiment, the size-related classification of the keyword strings includes the keyword strings "height" and "width", and the keyword string library sets "height" to correspond to the scale graphic. Therefore, after obtaining the keyword strings "height" and "width" in the string, the system obtains the ruler graphic as an additional item.
The added graphics are not limited to the above-mentioned labels and rulers, but may also be arrows, various geometric shapes for encircling, frame, etc. For example, the label can be set to correspond to the color, material, and other keyword strings in the keyword string library, and the scale can be set to correspond to the length or length unit in the keyword string. For example, in a self-portrait scene, it is also possible to add a frame corresponding to the content of the dialogue, an expression icon corresponding to the mood, etc. according to the keyword string matching.
Returning to FIG. 3 again, in step S34, the at least one addition item is added to the picture respectively. Fig. 6 shows a schematic diagram of text addition items, label addition items, and ruler addition items respectively added to the picture. After adding the added item, the user can make at least one of the following modifications according to the gesture or input: changing the position of the added item, changing the size of the added item, editing the content of the added item, and deleting the added item. item. For example, as shown in Figure 6, for the ruler in the figure, the user can move the two ends of the ruler through gestures to change the length of the ruler, rotate the ruler through gestures, change the angle of the ruler, delete the ruler through gestures, and so on.
In one embodiment, as described with reference to FIG. 2, after the user opens the picture, according to the picture application scene selected by the user, at least one graphic preset to correspond to the scene is obtained as at least one addition, and At least one additional item obtained according to the scene is added to the picture. The specific example is as described with reference to FIG. 2 and will not be repeated here.
In addition, after completing the above editing, the user can add a QR code to the picture, for example through an on-screen interface for adding a QR code, and then save and share the picture. In the shared picture, the labels display the various attributes of the product accurately and clearly, so buyers can understand the product quickly, which helps promote the marketing of the product.
FIG. 7 shows a
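11‧‧‧Display unit
12‧‧‧Voice receiving unit
13‧‧‧Voice recognition unit
14‧‧‧Acquisition unit
15‧‧‧Keyword string library
16‧‧‧Picture editing unit
S21‧‧‧Method step
S22‧‧‧Method step
S23‧‧‧Method step
S31‧‧‧Method step
S32‧‧‧Method step
S33‧‧‧Method step
S34‧‧‧Method step
71‧‧‧Receiving unit
72‧‧‧Recognition unit
73‧‧‧Adding unit
81‧‧‧Receiving unit
82‧‧‧Recognition unit
83‧‧‧First acquisition unit
84‧‧‧First adding unit
85‧‧‧Second acquisition unit
86‧‧‧Second adding unit
87‧‧‧Prompt unit
88‧‧‧Modification unit
100‧‧‧System
700‧‧‧Picture processing device
800‧‧‧Picture processing device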
The embodiments of this specification can be made clearer by describing them with reference to the drawings:
FIG. 1 schematically shows a system according to an embodiment of this specification;
FIG. 2 shows a flowchart of a picture processing method according to an embodiment of this specification;
FIG. 3 shows a flowchart of a picture processing method according to an embodiment of this specification;
FIG. 4 shows an example of a merchandise marketing scene;
FIG. 5 schematically shows an on-screen prompt for voice input content in the merchandise marketing scene;
FIG. 6 shows a schematic diagram of text addition items, label addition items, and ruler addition items added to the picture;
FIG. 7 shows a picture processing device according to an embodiment of this specification; and
FIG. 8 shows a picture processing device according to an embodiment of this specification.
Claims (25)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
??201810266755.4 | 2018-03-28 | ||
CN201810266755.4 | 2018-03-28 | ||
CN201810266755.4A CN108805958A (en) | 2018-03-28 | 2018-03-28 | A kind of image processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201942873A (en) | 2019-11-01 |
TWI698835B (en) | 2020-07-11 |
Family
ID=64095398
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW108101009A TWI698835B (en) | Image processing method and device and computer-readable storage medium | | 2019-01-10 |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN108805958A (en) |
TW (1) | TWI698835B (en) |
WO (1) | WO2019184539A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108805958A (en) * | 2018-03-28 | 2018-11-13 | 阿里巴巴集团控股有限公司 | A kind of image processing method and device |
JP6807621B1 (en) | 2020-08-05 | 2021-01-06 | 株式会社インタラクティブソリューションズ | A system for changing images based on audio |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7921037B2 (en) * | 2002-04-01 | 2011-04-05 | Hewlett-Packard Development Company, L.P. | Personalized messaging determined from detected content |
TWI402767B (en) * | 2008-11-28 | 2013-07-21 | Hon Hai Prec Ind Co Ltd | Electronic apparatus capable for editing photo and method thereof |
CN103365970A (en) * | 2013-06-25 | 2013-10-23 | 广东小天才科技有限公司 | Method and device for automatically acquiring learning material information |
CN104766353A (en) * | 2015-04-25 | 2015-07-08 | 陈包容 | Method and device for adding text content into background |
TWI534647B (en) * | 2015-07-07 | 2016-05-21 | 中華電信股份有限公司 | Customizable picture template system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2409365B (en) * | 2003-12-19 | 2009-07-08 | Nokia Corp | Image handling |
CN105302786B (en) * | 2015-11-10 | 2019-05-24 | 百度在线网络技术(北京)有限公司 | The edit methods and device of data |
CN107707836A (en) * | 2017-09-11 | 2018-02-16 | 广东欧珀移动通信有限公司 | Image processing method and device, electronic installation and computer-readable recording medium |
CN108805958A (en) * | 2018-03-28 | 2018-11-13 | 阿里巴巴集团控股有限公司 | A kind of image processing method and device |
- 2018-03-28: CN application CN201810266755.4A, published as CN108805958A (status: Pending)
- 2019-01-02: WO application PCT/CN2019/070040, published as WO2019184539A1 (status: Application Filing)
- 2019-01-10: TW application TW108101009A, published as TWI698835B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
TW201942873A (en) | 2019-11-01 |
WO2019184539A1 (en) | 2019-10-03 |
CN108805958A (en) | 2018-11-13 |