CN114727138B

CN114727138B - Commodity information processing method, commodity information processing device and computer equipment

Info

Publication number: CN114727138B
Application number: CN202210345634.5A
Authority: CN
Inventors: 李洁倩
Original assignee: Volkswagen Mobvoi Beijing Information Technology Co Ltd
Current assignee: Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority date: 2022-03-31
Filing date: 2022-03-31
Publication date: 2023-12-19
Anticipated expiration: 2042-03-31
Also published as: CN114727138A

Abstract

The application relates to a commodity information processing method, a commodity information processing device, computer equipment and a storage medium, wherein the commodity information processing method comprises the following steps: receiving audio and video data of a live broadcasting room, and playing the audio and video data at a vehicle-mounted terminal of a vehicle; determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data; receiving first voice data of a user in a vehicle, and extracting target commodity attribute information from the first voice data; and determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity. According to the method, the audio and video data of the live broadcasting room can be played at the vehicle-mounted terminal of the vehicle, the target commodity which the user intends to purchase on the live broadcasting room is determined at the vehicle-mounted terminal of the vehicle, and a data processing basis is provided for realizing live broadcasting shopping at the vehicle-mounted terminal.

Description

Commodity information processing method, commodity information processing device and computer equipment

Technical Field

The present disclosure relates to the field of vehicle processing technologies, and in particular to a commodity information processing method, a commodity information processing device, a computer device, and a storage medium.

Background

With the growth of live shopping, many shopping platforms have launched live shopping functions. By June 2020, the number of live e-commerce users had reached 309 million, and the number of online retail users had reached 749 million. Compared with the static pictures of an online platform, live shopping is more intuitive, more realistic and more interactive, and presents commodities to consumers more vividly. These figures show that live shopping has a certain prospect and market. However, owing to technical and safety considerations, conventional vehicle-mounted devices have not introduced a live shopping function.

Disclosure of Invention

Accordingly, in view of the above technical problems, it is necessary to provide a commodity information processing method, a commodity information processing device, a computer device and a storage medium that are capable of playing audio and video data of a live broadcasting room at a vehicle-mounted terminal of a vehicle and determining, at the vehicle-mounted terminal, a target commodity that a user intends to purchase from the live broadcasting room, thereby providing a data processing basis for realizing live shopping at the vehicle-mounted terminal.

A commodity information processing method, comprising: receiving audio and video data of a live broadcasting room, and playing the audio and video data at a vehicle-mounted terminal of a vehicle; determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data; receiving first voice data of a user in a vehicle, and extracting target commodity attribute information from the first voice data; and determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity.

In one embodiment, determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data includes: acquiring audio data in the audio and video data; determining first key information in the audio data through a voice recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the first key information.

In one embodiment, determining first key information in the audio data through a voice recognition technology includes: acquiring a volume value of each piece of audio segment data of the audio data; screening out, according to the volume value of each piece of audio segment data, target audio segment data whose volume value meets a first setting condition; converting the target audio segment data into text through the voice recognition technology, and screening out from the text target segmented words that meet a second setting condition; and determining the first key information according to the target segmented words.

In one embodiment, determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data includes: acquiring video data in the audio and video data; determining second key information in the video data through an image recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the second key information.

In one embodiment, determining the second key information in the video data by image recognition techniques includes: acquiring the shape and the size of an object in each frame of image in video data through an image recognition technology; and determining second key information in the video data according to the shape and the size of the object in each frame of image.

In one embodiment, determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data includes: acquiring third key information of each commodity in the live broadcasting room according to the audio data in the audio and video data; acquiring fourth key information of each commodity in the live broadcasting room according to the video data in the audio and video data; filtering the third key information and the fourth key information to obtain target key information; and determining commodity attribute information of each commodity in the live broadcasting room according to the target key information.

In one embodiment, after the step of determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity, the method further includes: receiving second voice data indicating purchase of the target commodity; and performing a purchasing operation of the target commodity according to the second voice data.

A commodity information processing apparatus comprising: the playing module is used for receiving the audio and video data of the live broadcasting room and playing the audio and video data at the vehicle-mounted terminal of the vehicle; the first determining module is used for determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data; the extraction module is used for receiving first voice data of a user in the vehicle and extracting target commodity attribute information from the first voice data; and the second determining module is used for determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity.

A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the methods of the embodiments described above when the computer program is executed by the processor.

A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of any of the embodiments described above.

The commodity information processing method, the commodity information processing device, the computer device and the storage medium receive the audio and video data of the live broadcasting room and play the audio and video data at the vehicle-mounted terminal of the vehicle; determine commodity attribute information of each commodity in the live broadcasting room according to the audio and video data; receive first voice data of a user in the vehicle, and extract target commodity attribute information from the first voice data; and determine the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity. Therefore, the audio and video data of the live broadcasting room can be played at the vehicle-mounted terminal of the vehicle, and the target commodity that the user intends to purchase from the live broadcasting room is determined at the vehicle-mounted terminal, so that a data processing basis is provided for realizing live shopping at the vehicle-mounted terminal.

Drawings

FIG. 1 is an application environment diagram of a commodity information processing method according to one embodiment;

FIG. 2 is a flow chart of a commodity information processing method according to one embodiment;

FIG. 3 is a schematic diagram of a system framework of a commodity information processing method in one embodiment;

FIG. 4 is a schematic diagram of an interface on which the vehicle-mounted terminal displays a target commodity according to one embodiment;

FIG. 5 is a schematic diagram illustrating a process of identifying a frame image in a video at the server in one embodiment;

FIG. 6 is a schematic diagram of the server's processing flow for the commodity attribute information uploaded by the vehicle-mounted terminal in one embodiment;

FIG. 7 is a block diagram showing a structure of a commodity information processing device according to one embodiment;

FIG. 8 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

The commodity information processing method provided by the application is applied to an application environment shown in fig. 1. In one embodiment, as shown in fig. 1, the in-vehicle terminal 102 performs a commodity information processing method of the present application. Specifically, the vehicle-mounted terminal 102 receives audio and video data from the live broadcasting room of the live broadcasting party 104, plays the audio and video data at the vehicle-mounted terminal 102 of the vehicle, determines commodity attribute information of each commodity in the live broadcasting room according to the audio and video data, receives first voice data of a user in the vehicle, extracts target commodity attribute information from the first voice data, and determines a target commodity according to the target commodity attribute information and the commodity attribute information of each commodity.

In one embodiment, as shown in fig. 2, a commodity information processing method is applied to the vehicle-mounted terminal shown in fig. 1, and includes the following steps:

S202, receiving audio and video data of the live broadcasting room, and playing the audio and video data at a vehicle-mounted terminal of the vehicle.

In this embodiment, the vehicle-mounted terminal establishes a data interaction link with the live broadcasting party, and receives audio and video data from the live broadcasting room of the live broadcasting party through the data interaction link. The audio and video data of the live broadcasting room comprise video data and audio data. The video data comprises a sequence of consecutive images of the live broadcasting room, and the audio data comprises the various voice data of the live broadcasting room. When the vehicle-mounted terminal displays the audio and video data of the live broadcasting room, a user in the vehicle can directly obtain commodity information of one or more commodities in the live broadcasting room.

S204, determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data.

In this embodiment, when the vehicle-mounted terminal receives the audio and video data of the live broadcasting room, commodity information of each commodity in the live broadcasting room can be obtained through the audio and video data. The commodity information includes commodity attribute information including one or more of a style, a color, a size, and a price of the commodity.
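As an illustration only, the commodity attribute information described above could be held in a small record in which every field is optional, since the text says it includes "one or more of" style, color, size and price. The class name `CommodityAttributes` and the method `known` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class CommodityAttributes:
    """Hypothetical container for the attributes named in the text."""
    style: Optional[str] = None
    color: Optional[str] = None
    size: Optional[str] = None
    price: Optional[float] = None

    def known(self) -> dict:
        """Return only the attributes that have actually been determined."""
        return {k: v for k, v in asdict(self).items() if v is not None}


attrs = CommodityAttributes(style="jacket", price=299.0)
print(attrs.known())
```

Leaving undetermined fields as `None` lets later steps fill them in as more key information is recognized from the audio and video data.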

S206, receiving first voice data of a user in the vehicle, and extracting target commodity attribute information from the first voice data.

In this embodiment, the vehicle-mounted terminal displays the audio and video data of the live broadcasting room for the user, and the user can learn about the various commodities recommended in the live broadcasting room. When the user wants to purchase a certain commodity, the user can speak to the vehicle-mounted terminal. The vehicle-mounted terminal receives the first voice data of the user and acquires target commodity attribute information from the first voice data. The target commodity attribute information is commodity attribute information indicating the target commodity that the user intends to purchase. The target commodity attribute information may include one or more of a style, color, size, and price of the commodity.
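A minimal sketch of extracting target commodity attribute information from the first voice data, assuming that a speech recognizer has already turned the user's utterance into an English transcript. The vocabulary `ATTRIBUTE_VOCAB` and the function `extract_target_attributes` are hypothetical; the application does not specify how the extraction is implemented.

```python
import re
from typing import Dict

# Hypothetical vocabularies; a real system would derive them from the
# commodities actually listed in the live broadcasting room.
ATTRIBUTE_VOCAB = {
    "color": {"red", "blue", "black", "white"},
    "size": {"small", "medium", "large"},
    "style": {"jacket", "dress", "sneakers"},
}


def extract_target_attributes(transcript: str) -> Dict[str, str]:
    """Return the first recognized word for each attribute type."""
    found: Dict[str, str] = {}
    for word in re.findall(r"[a-z]+", transcript.lower()):
        for attribute, vocab in ATTRIBUTE_VOCAB.items():
            if word in vocab and attribute not in found:
                found[attribute] = word
    return found


print(extract_target_attributes("I want the red jacket in size large"))
```

The result is a partial attribute set, which matches the statement that the target commodity attribute information may contain only some of style, color, size and price.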

S208, determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity.

In this embodiment, the target commodity attribute information is matched against the commodity attribute information of each commodity, and the target commodity corresponding to the target commodity attribute information is determined. The vehicle-mounted terminal may itself determine the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity. Alternatively, the vehicle-mounted terminal may send the target commodity attribute information to the server, which holds the commodity attribute information of each commodity; the server determines the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity, and then sends the target commodity to the vehicle-mounted terminal.

Specifically, the server provides a background service for the vehicle-mounted terminal and establishes a data interaction interface for exchanging data with the vehicle-mounted terminal. The server obtains from the live broadcasting party the commodity information of the commodities shown in the audio and video data of the live broadcasting room. The commodity information comprises commodity attribute information, and the commodity attribute information comprises one or more of the color, the size and the style of the commodity. After receiving the target commodity attribute information, the server matches the target commodity attribute information against the acquired commodity attribute information of each commodity in the live broadcasting room, and thereby determines the target commodity.

In one example, determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity includes: the target commodity attribute information is sent to a server, and target commodities issued by the server are received; the server side comprises commodity attribute information of one or more commodities, matches the target commodity attribute information with commodity attribute information of each commodity, and determines the target commodity according to a matching result.

Specifically, an interactive link is established between a management background of the server and the live broadcasting party, and the management background pulls, through the interactive link, the commodity information of one or more commodities listed by the live broadcasting party. The commodity information contains commodity attribute information, including commodity name, commodity color, commodity size, and so on. While the one or more commodities listed in the live broadcasting room are being broadcast, the commodities that the anchor explains and displays in the live broadcasting room are the commodities corresponding to the audio and video data of the live broadcasting room. When the management background pulls the commodity information of the one or more commodities from the live broadcasting room, it classifies the commodity information by commodity ID, and extracts and marks the commodity attribute information of each commodity.

The management background establishes an interactive link with the live broadcasting party. After one or more commodities are listed by the live broadcasting party, the management background can classify the commodity information of the listed commodities, establish commodity IDs, and extract the commodity attribute information of each commodity, which comprises the commodity name, commodity color, commodity size, and so on. The management background then issues each commodity and its commodity attribute information to the vehicle-mounted terminal.

The management background stores the commodity attribute information of the one or more commodities corresponding to the audio and video data of the live broadcasting room. When it receives the target commodity attribute information uploaded by the vehicle-mounted terminal, the management background matches the target commodity attribute information against the commodity attribute information of the one or more commodities, and takes the commodity with the highest matching degree as the target commodity. Alternatively, the commodity need not have the highest matching degree: any commodity whose matching result satisfies a set matching condition may be determined to be the target commodity.
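The matching step can be sketched as follows. This is only one possible reading: the application does not define how the matching degree is computed, so the sketch assumes it is the fraction of target attributes that a commodity reproduces exactly. The names `matching_degree`, `determine_target_commodity` and `min_degree` are hypothetical.

```python
from typing import Dict, Optional


def matching_degree(target: Dict[str, str], commodity: Dict[str, str]) -> float:
    """Assumed measure: share of target attributes the commodity matches."""
    if not target:
        return 0.0
    hits = sum(1 for key, value in target.items() if commodity.get(key) == value)
    return hits / len(target)


def determine_target_commodity(
    target: Dict[str, str],
    catalogue: Dict[str, Dict[str, str]],
    min_degree: float = 0.5,  # hypothetical "set matching condition"
) -> Optional[str]:
    """Return the ID with the highest matching degree, if it is high enough."""
    best_id, best_degree = None, 0.0
    for commodity_id, attributes in catalogue.items():
        degree = matching_degree(target, attributes)
        if degree > best_degree:
            best_id, best_degree = commodity_id, degree
    return best_id if best_degree >= min_degree else None


catalogue = {
    "A1": {"style": "jacket", "color": "red", "size": "large"},
    "B2": {"style": "jacket", "color": "blue", "size": "large"},
}
print(determine_target_commodity({"color": "red", "style": "jacket"}, catalogue))
```

Returning `None` when no commodity reaches the threshold leaves room for the terminal to ask the user to repeat or refine the request.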

After the management background determines the target commodity, commodity information of the target commodity is issued to the vehicle-mounted terminal. The commodity information here includes commodity attribute information including any one or more of a style, a color, and a size of a commodity, and may also include a commodity price, a commodity image, and the like.

The commodity information processing method receives the audio and video data of the live broadcasting room and plays the audio and video data at the vehicle-mounted terminal of the vehicle; determines commodity attribute information of each commodity in the live broadcasting room according to the audio and video data; receives first voice data of a user in the vehicle, and extracts target commodity attribute information from the first voice data; and determines the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity. Therefore, the audio and video data of the live broadcasting room can be played at the vehicle-mounted terminal of the vehicle, and the target commodity that the user intends to purchase from the live broadcasting room is determined at the vehicle-mounted terminal, so that a data processing basis is provided for realizing live shopping at the vehicle-mounted terminal.

In one embodiment, determining the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data includes: acquiring audio data in the audio and video data; determining first key information in the audio data through a voice recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the first key information.

In this embodiment, the audio and video data of the live broadcasting room may or may not include the target commodity. As shown in fig. 3, the vehicle-mounted terminal plays the audio and video data of the live broadcasting room and monitors the audio data in the played audio and video data. First key information in the audio data is determined through a voice recognition technology. The first key information may be a keyword. Determining the first key information in the audio data through the voice recognition technology may be done by acquiring a target audio segment in the audio data and using the voice recognition technology to obtain, from the target audio segment, keywords representing commodity attributes.

The intercepted target audio segment may be an audio segment in which the anchor introduces and explains a commodity in the live broadcasting room. The target audio segment may be converted from speech to text by a voice recognition technology to obtain one or more keywords. A data interaction link is established between the vehicle-mounted terminal and the live broadcasting party, and an online ASR (Automatic Speech Recognition) capability is established in the vehicle-mounted terminal. The vehicle-mounted terminal uses the online ASR to recognize keywords in the target audio segment and obtain keywords representing commodity attributes, so as to distinguish the commodity that the anchor is explaining. The vehicle-mounted terminal may convert the audio data of the commodity introduction being explained by the anchor into text through the online ASR function and transmit the text to the management background of the server; the management background converts the key information in the text into keywords and records the keywords under the ID of the corresponding commodity of the vehicle-mounted terminal.

For example, an online ASR capability is established at the vehicle-mounted terminal, and keywords of the target audio segment are identified to distinguish the commodity being explained by the anchor. A data interaction interface is established between the vehicle-mounted terminal and the management background of the server. The vehicle-mounted terminal receives the anchor voice data in the audio and video data of the live broadcasting room through a sound receiving device such as a microphone array, and sends the anchor voice data through a bus structure to a cloud voice recognition engine, which recognizes the natural language in the anchor voice data and obtains a corresponding target text. The target text is segmented into words, and one or more keywords are extracted. The one or more keywords are sent to the management background and matched against the commodity attribute information of each commodity in the management background. If the matching similarity is more than 80%, the commodity determined by the keywords and the commodity having that commodity attribute information are judged to be the same commodity, and the keywords are taken as commodity attribute information of that commodity. This makes it convenient for the management background of the server to manage the commodity attribute information of the commodities.
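The "more than 80%" similarity check could look like the sketch below. The application does not say which similarity measure is used; here Python's standard `difflib.SequenceMatcher` ratio stands in for it as an assumption, and the function names and the sample catalogue are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List

SIMILARITY_THRESHOLD = 0.8  # "more than 80%" in the text


def similarity(a: str, b: str) -> float:
    """Assumed similarity measure: difflib's ratio on lower-cased strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_keywords_to_commodities(
    keywords: List[str], catalogue: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Map each commodity ID to the keywords that resemble one of its attributes."""
    matches: Dict[str, List[str]] = {}
    for commodity_id, attribute_values in catalogue.items():
        for keyword in keywords:
            best = max((similarity(keyword, v) for v in attribute_values), default=0.0)
            if best > SIMILARITY_THRESHOLD:
                matches.setdefault(commodity_id, []).append(keyword)
    return matches


catalogue = {"A1": ["red", "down jacket"], "B2": ["blue", "sneakers"]}
print(match_keywords_to_commodities(["red", "sneaker"], catalogue))
```

A fuzzy measure tolerates small recognition differences such as "sneaker" versus "sneakers", which an exact string comparison would reject.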

In one embodiment, determining the first key information in the audio data through the voice recognition technology includes: acquiring a volume value of each piece of audio segment data of the audio data; screening out, according to the volume value of each piece of audio segment data, target audio segment data whose volume value meets a first setting condition; converting the target audio segment data into text through the voice recognition technology, and screening out from the text target segmented words that meet a second setting condition; and determining the first key information according to the target segmented words.

In this embodiment, the first setting condition may be as follows: a target volume value interval is set, and when a volume value falls within the target volume value interval, the audio segment data corresponding to that volume value is determined to be target audio segment data. Screening out the target audio segment data whose volume value meets the first setting condition according to the volume value of each piece of audio segment data specifically means comparing the volume value of each piece of audio segment data with the target volume value interval of the first setting condition; if the volume value of a piece of audio segment data falls within the target volume value interval, that piece of audio segment data is determined to be target audio segment data.
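The volume screening described above can be sketched as follows. The application does not state how the volume value is measured; this sketch assumes the root-mean-square amplitude of the samples in a segment, and the interval bounds `low` and `high` are hypothetical parameters.

```python
import math
from typing import List


def rms_volume(samples: List[float]) -> float:
    """Assumed volume measure: root-mean-square amplitude of a segment."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def screen_target_segments(
    segments: List[List[float]], low: float, high: float
) -> List[int]:
    """Return indices of segments whose volume lies in [low, high]."""
    return [i for i, seg in enumerate(segments) if low <= rms_volume(seg) <= high]


segments = [[0.0, 0.0], [0.5, -0.5], [1.0, 1.0]]
print(screen_target_segments(segments, low=0.2, high=0.8))
```

Using an interval rather than a single lower bound also excludes very loud segments, which is consistent with the text's wording of a "target volume value interval".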

Natural language recognition is performed on the target audio segment to obtain the corresponding text. A second setting condition is set; the second setting condition may be a preset phrase that comprises one or more commodity attribute words. Screening out from the text the target segmented words that meet the second setting condition may be done as follows: the text is segmented into words, each segmented word is matched against the one or more commodity attribute words in the phrase, and the successfully matched segmented words are taken as the target segmented words. Determining the first key information according to the target segmented words may be done as follows: the first key information includes keywords, and the target segmented words are used as the keywords of the first key information. Alternatively, the meaning of each target segmented word is identified, and the first key information is generated based on the meaning of each target segmented word.
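A minimal sketch of the second setting condition, assuming English text in which words can be separated with a regular expression (Chinese text would need a dedicated word segmenter, which is outside this sketch). The function name `screen_target_words` and the sample attribute words are hypothetical.

```python
import re
from typing import List, Set


def screen_target_words(text: str, attribute_words: Set[str]) -> List[str]:
    """Segment the text and keep words that appear in the preset phrase.

    Each target segmented word is returned once, in order of first appearance.
    """
    seen: Set[str] = set()
    targets: List[str] = []
    for word in re.findall(r"\w+", text.lower()):
        if word in attribute_words and word not in seen:
            seen.add(word)
            targets.append(word)
    return targets


preset_phrase = {"red", "large", "blue"}  # hypothetical commodity attribute words
print(screen_target_words("This red jacket, yes red, comes in large", preset_phrase))
```

The returned words can be used directly as the keywords of the first key information, which corresponds to the first of the two options described in the text.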

In one embodiment, determining the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data includes: acquiring video data in the audio and video data; determining second key information in the video data through an image recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the second key information.

In this embodiment, the one or more commodities of the live broadcasting room may or may not include the target commodity. As shown in fig. 3, the vehicle-mounted terminal plays the audio and video data of the live broadcasting room, monitors the played video data, and determines the second key information in the video data through the image recognition technology. The second key information may be one or more of a keyword, a frame image, and a portion of image data. Determining the second key information in the video data through the image recognition technology may be done by acquiring a target video segment in the video data, using a video analysis technology to obtain from the target video segment a frame image containing one or more pieces of commodity attribute information, and determining the second key information according to the frame image. For example, the frame image itself is taken as the second key information. Alternatively, commodity attribute information such as the shape, color and size of the commodity is identified from the frame image and used as the second key information.

Alternatively, an AI video analysis function is established at the vehicle-mounted terminal, the anchor's introduction video is analyzed, and the second key information of the commodity is marked and uploaded to the server. The server records it under the ID of the target commodity of the vehicle-mounted terminal, so that the server can conveniently manage the commodity attribute information of the target commodity.

In one embodiment, the determining the second key information in the video data by the image recognition technology includes: acquiring the shape and the size of an object in each frame of image in video data through an image recognition technology; and determining second key information in the video data according to the shape and the size of the object in each frame of image.

In this embodiment, the video data comprises a plurality of frames of images, each frame of image comprising one or more objects. The shape and the size of the object in each frame of image are acquired through an image recognition technology, and the second key information in the video data is then determined based on the shape and the size of the object in each frame of image. The second key information may comprise image information on the shape and size of the object and/or alphanumeric information on the shape and size of the object.
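The application does not specify the image recognition technology. Purely as an illustration, the sketch below assumes that an upstream recognition step has already produced a binary mask of the object in a frame (1 for object pixels, 0 for background), and derives a size from the bounding box and a coarse shape label from the aspect ratio. The function name, the labels and the 1.5 ratio cut-off are all hypothetical.

```python
from typing import Dict, List, Optional


def object_shape_and_size(mask: List[List[int]]) -> Optional[Dict[str, object]]:
    """Derive bounding-box size and a coarse shape label from a binary mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    n_cols = len(mask[0]) if mask else 0
    cols = [c for c in range(n_cols) if any(row[c] for row in mask)]
    if not rows or not cols:
        return None  # no object pixels in this frame
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    ratio = width / height
    if ratio > 1.5:
        shape = "wide"
    elif ratio < 1 / 1.5:
        shape = "tall"
    else:
        shape = "square-like"
    return {"shape": shape, "width": width, "height": height}


frame_mask = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
]
print(object_shape_and_size(frame_mask))
```

The resulting dictionary is an example of the "alphanumeric information" form of the second key information; sizes here are in pixels, and converting them to real-world dimensions would need calibration that this sketch does not attempt.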

In one embodiment, determining the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data includes: acquiring third key information of each commodity in the live broadcasting room according to the audio data in the audio and video data; acquiring fourth key information of each commodity in the live broadcasting room according to the video data in the audio and video data; filtering the third key information and the fourth key information to obtain target key information; and determining commodity attribute information of each commodity in the live broadcasting room according to the target key information.

In this embodiment, third key information of each commodity is obtained from the audio data in the audio and video data, and fourth key information of each commodity is obtained from the video data in the audio and video data. The third key information and the fourth key information are then filtered to remove information that does not represent commodity attributes, so that the target key information is screened out, and the commodity attribute information of each commodity in the live broadcasting room is determined according to the target key information.
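The filtering of the third and fourth key information can be sketched as a merge followed by a vocabulary check. The sketch assumes both kinds of key information are available as lists of words and that a set of known commodity attribute words exists; `filter_key_information` and the sample data are hypothetical.

```python
from typing import List, Set


def filter_key_information(
    third: List[str], fourth: List[str], attribute_words: Set[str]
) -> List[str]:
    """Merge audio-derived and video-derived key information.

    Items that do not represent commodity attributes are dropped, and
    duplicates across the two sources are kept only once.
    """
    target: List[str] = []
    for item in [*third, *fourth]:
        normalized = item.strip().lower()
        if normalized in attribute_words and normalized not in target:
            target.append(normalized)
    return target


third_key_info = ["Red", "welcome", "large"]   # from audio data
fourth_key_info = ["red", "jacket", "logo"]    # from video data
print(filter_key_information(third_key_info, fourth_key_info, {"red", "large", "jacket"}))
```

Combining both sources lets attributes that are only spoken (such as size) and attributes that are only visible (such as style) end up in the same target key information.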

In one embodiment, after the target commodity is determined, second voice data indicating purchase of the target commodity is received, and a purchasing operation of the target commodity is performed according to the second voice data. In this embodiment, when the user confirms that the target commodity is the commodity the user wants to purchase, the user may send second voice data for purchasing the target commodity to the vehicle-mounted terminal. The second voice data may contain operation information indicated by the user's voice. On receiving the second voice data, the vehicle-mounted terminal responds to the operation information in the second voice data and performs the purchasing operation of the target commodity. For example, when the vehicle-mounted terminal displays the target commodity and the user sends second voice data confirming the purchase, the vehicle-mounted terminal initiates a purchase request through a purchase link provided by the live broadcasting party so as to execute the purchase of the target commodity.

For example, the vehicle-mounted terminal displays the target commodity with the highest matching value issued by the server for verification by the user; the interface on which the vehicle-mounted terminal displays the target commodity is shown in fig. 4. After confirming, the user chooses to purchase by clicking on the interface or by voice, and the vehicle-mounted terminal completes the purchase and payment operation for the target commodity. In this way, the vehicle-mounted terminal enables the user to shop from a live broadcast in the vehicle.
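How the operation information in the second voice data is interpreted is not specified in the application. As a rough sketch, assuming the second voice data has already been transcribed to English text, a terminal could check for confirmation and refusal phrases before building a purchase request. The phrase lists, the function name and the request fields are hypothetical, and no real purchase interface is called.

```python
from typing import Dict, Optional

# Hypothetical phrase lists; refusals are checked first so that
# "do not purchase" is never treated as a confirmation.
CANCEL_MARKERS = ("cancel", "don't", "do not", "no thanks")
CONFIRM_MARKERS = ("buy", "purchase", "place the order")


def purchase_operation(transcript: str, target_commodity_id: str) -> Optional[Dict[str, str]]:
    """Return a purchase request for the target commodity, or None."""
    text = transcript.lower()
    if any(marker in text for marker in CANCEL_MARKERS):
        return None
    if any(marker in text for marker in CONFIRM_MARKERS):
        return {"action": "purchase", "commodity_id": target_commodity_id}
    return None  # utterance is unrelated to purchasing


print(purchase_operation("Yes, buy it", "A1"))
```

In a vehicle, requiring an explicit confirmation phrase before any request is built is a cautious default, since unrelated speech in the cabin should not trigger a purchase.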

In an embodiment, after the step of receiving audio and video data in the live broadcast room, the method further includes: converting the audio and video data into text information, and sending the text information to a server so that the server obtains commodity attribute information of each commodity in the live broadcasting room from the text information; the determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity comprises the following steps: and sending the target commodity attribute information to the server side, so that the server side determines the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity, and receives the target commodity issued by the server side.

In this embodiment, the server receives the text information uploaded by the vehicle-mounted terminal; acquires, from the text information, keywords representing commodity attributes of one or more commodities; matches the one or more keywords against the commodity attribute information of each commodity; identifies the keywords that fail to match, together with the first commodity to which those keywords correspond; and associates the unmatched keywords with the first commodity, so that the unmatched keywords are taken as commodity attribute information of the first commodity. In this way, missing commodity attribute information of each commodity can be filled in. Further, the server determines the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity.

The vehicle-mounted terminal can convert the commodity introduction being explained by the anchor into text information through an online ASR function and transmit the text information to a management background; the management background converts key information in the text information into keywords and records the keywords under the commodity ID of the corresponding commodity of the vehicle-mounted terminal.

For example, during the live broadcast, the vehicle-mounted terminal picks up the anchor's voice through a sound receiving device such as a microphone array and performs voice recognition by sending the voice data through a bus structure to a cloud voice recognition engine, which performs natural language recognition on the input anchor voice data to obtain a corresponding target text. The target text is one form of the text information described above. The target text is segmented into words, and keywords representing commodity attributes are extracted. After the commodity ID corresponding to the keywords is identified, the extracted keywords are compared with the commodity attribute information in the management background, and any keyword whose attribute is absent is added under that commodity ID.

To illustrate: the management background holds a commodity ID for a cashmere overcoat, and the color attribute recorded for it is white and gray coffee. During the live broadcast, the anchor introduces the commodity as follows: "this cashmere overcoat is very fashionably cut and currently comes in three selectable colors: white, gray coffee and caramel". Then:

1. The management background analyzes, in real time, the text information of the anchor's introduction uploaded by the vehicle-mounted terminal and segments the sentence into "this/cashmere overcoat/cut/very fashionable/currently/white/gray coffee/caramel/three colors/selectable".

2. "Cashmere overcoat", "white", "gray coffee" and "caramel" are marked as keywords.

3. After matching against the commodity attributes recorded for the commodity in the management background, caramel is recorded under the commodity ID of the cashmere overcoat as a color attribute supplemented by the anchor.
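The three steps of the cashmere overcoat example can be sketched in a few lines. This is a minimal sketch under assumptions: the in-memory `catalog`, the `COLOR_WORDS` vocabulary, the commodity ID string, and the function `supplement_colors` are hypothetical stand-ins for the management background, and the sentence is assumed to be already segmented.

```python
# Hypothetical in-memory stand-in for the management background's records.
catalog = {
    "cashmere-overcoat": {"name": "cashmere overcoat", "color": {"white", "gray coffee"}},
}

# Assumed vocabulary of color words used to recognize color keywords.
COLOR_WORDS = {"white", "gray coffee", "caramel", "black"}


def supplement_colors(catalog, commodity_id, tokens):
    """Record colors the anchor mentioned that the catalog lacks.

    Returns the set of newly added colors.
    """
    mentioned = {token for token in tokens if token in COLOR_WORDS}
    missing = mentioned - catalog[commodity_id]["color"]
    catalog[commodity_id]["color"] |= missing
    return missing


# Segmented sentence from the example (step 1).
tokens = ["this", "cashmere overcoat", "cut", "very fashionable", "currently",
          "white", "gray coffee", "caramel", "three colors", "selectable"]

added = supplement_colors(catalog, "cashmere-overcoat", tokens)
print(sorted(added))  # ['caramel']
```

Only the attribute that is absent from the record is added; white and gray coffee match what is already stored and are left unchanged.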

In an embodiment, after the step of receiving audio and video data of the live broadcasting room, the method further includes: converting the audio and video data to obtain frame images; and sending the frame images to a server side, so that the server side obtains commodity attribute information of each commodity in the live broadcasting room from the frame images. Determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity then comprises: sending the target commodity attribute information to the server side, so that the server side determines the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity; and receiving the target commodity issued by the server side.

In this embodiment, an AI video analysis function is established at the vehicle-mounted terminal to analyze the anchor's introduction video in the live broadcasting room and to capture and upload frame images; the management background on the server side analyzes the frame images to obtain commodity attribute information of each commodity. The commodity attribute information of each commodity is recorded under the commodity ID of the corresponding commodity of the vehicle-mounted terminal. In this way, missing commodity attribute information of each commodity can be filled in. Further, the server determines the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity.

The server side obtaining commodity attribute information of each commodity in the live broadcasting room from the frame images comprises: the server side inputs the frame images into a trained image recognition model to obtain the commodity attribute information of each commodity output by the image recognition model.

Specifically, while the vehicle-mounted terminal plays the audio and video data of the live broadcasting room, it can automatically capture live video clips and upload them to the management background. The management background performs feature extraction on the uploaded frame images based on image analysis, assisted by analysis of the video's voice. A trained image recognition model extracts features from the uploaded frame images to obtain various commodity labels, a commodity label being one representation of commodity attribute information. The specific extraction method is shown in fig. 5. The commodity labels are then analyzed, the commodity ID is confirmed, and the commodity labels are recorded under that commodity ID.

The implementation flow of the above two embodiments is shown in fig. 6. Specifically, the vehicle-mounted terminal converts audio data of the live broadcasting room into text information through online ASR technology and uploads the text information to the server, and/or converts video data of the live broadcasting room into frame images through video capture technology and uploads the frame images to the server. The text information and the frame images carry commodity attribute information of the commodities. The server performs the association matching operation with reference to the flowchart shown in fig. 6. If the matching degree is greater than 80%, the uploaded commodity attribute information belongs to a commodity already recorded in the server; the server then determines whether the uploaded information also includes commodity attribute information not yet recorded for that commodity and, if so, records it under that commodity ID. If the matching degree is less than 80%, the uploaded commodity attribute information does not belong to any commodity recorded in the server; in that case, the server creates a commodity ID for the commodity described by the uploaded information and records the uploaded commodity attribute information under the newly created commodity ID. In this way, missing commodity attribute information of each commodity can be filled in.
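The threshold branch of this flow can be sketched as below. The sketch rests on assumptions that the text does not fix: the matching degree is taken to be the fraction of uploaded attributes already recorded for a commodity, a degree of exactly 80% is treated like the below-threshold case, and the names `matching_degree`, `upsert_attributes` and the ID strings are hypothetical.

```python
THRESHOLD = 0.8  # the text branches on a matching degree of 80%


def matching_degree(uploaded, recorded):
    """Assumed metric: share of uploaded attributes already recorded."""
    if not uploaded:
        return 0.0
    return len(uploaded & recorded) / len(uploaded)


def upsert_attributes(catalog, uploaded, new_id):
    """Merge into the best-matching commodity or create a new commodity ID.

    Returns the commodity ID under which the attributes were recorded.
    """
    best_id, best = None, 0.0
    for commodity_id, recorded in catalog.items():
        degree = matching_degree(uploaded, recorded)
        if degree > best:
            best_id, best = commodity_id, degree
    if best_id is not None and best > THRESHOLD:
        catalog[best_id] |= uploaded  # record the attributes not yet present
        return best_id
    catalog[new_id] = set(uploaded)   # no recorded commodity matches
    return new_id


catalog = {"c1": {"cashmere", "overcoat", "white", "gray coffee", "long"}}
print(upsert_attributes(
    catalog, {"cashmere", "overcoat", "white", "gray coffee", "long", "caramel"}, "c2"))
print(upsert_attributes(catalog, {"sneaker", "red"}, "c2"))
```

In the first call five of the six uploaded attributes are already recorded (about 83%), so "caramel" is added under the existing ID; in the second call nothing matches, so a new ID is created.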

In summary, the commodity information processing method has the following advantages:

1. When the vehicle-mounted terminal plays the audio and video data of the live broadcasting room, the commodity attribute information introduced by the anchor in the live broadcasting room can be extracted as keywords, and the keywords are uploaded to the server to fill in commodity attribute information missing from the server.

2. When the vehicle-mounted terminal plays the audio and video data of the live broadcasting room, the video segment in which the anchor introduces the commodity is captured and the captured frame images are uploaded to the server; the server extracts the commodity attribute information in the frame images, such as category and color, as keywords, so that commodity attribute information missing from the server can be filled in.

3. The vehicle-mounted terminal can extract keywords from the user's voice after it is converted to text and send the keywords to the management background of the server side for keyword matching, so that the commodity with the highest similarity is matched and pushed to the user.
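The keyword matching in the third advantage can be sketched as follows. This is a hedged illustration only: the source does not define the similarity measure, so the sketch assumes it is the share of the user's keywords found among a commodity's recorded attributes, and `extract_target_attributes`, `best_match` and the sample data are hypothetical.

```python
def extract_target_attributes(utterance, vocabulary):
    """Pick out known attribute words from the recognized user utterance."""
    text = utterance.lower()
    return {word for word in vocabulary if word in text}


def best_match(target, catalog):
    """Return the ID of the commodity most similar to the target attributes.

    Returns None when no commodity shares any attribute with the target.
    """
    def score(attributes):
        return len(target & attributes) / len(target) if target else 0.0

    best_id = max(catalog, key=lambda cid: score(catalog[cid]), default=None)
    if best_id is None or score(catalog[best_id]) == 0.0:
        return None
    return best_id


catalog = {
    "coat": {"cashmere overcoat", "white", "caramel"},
    "scarf": {"scarf", "white"},
}
vocabulary = set().union(*catalog.values())

target = extract_target_attributes("I want the caramel cashmere overcoat", vocabulary)
print(best_match(target, catalog))  # coat
```

Returning `None` when nothing overlaps keeps the terminal from pushing an unrelated commodity to the user.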

In one embodiment, the present application further provides a commodity information processing device, as shown in fig. 7, which includes a playing module 702, a first determining module 704, an extracting module 706, and a second determining module 708. The playing module 702 is configured to receive audio and video data of a live broadcasting room and play the audio and video data at a vehicle-mounted terminal of a vehicle; the first determining module 704 is configured to determine commodity attribute information of each commodity in the live broadcasting room according to the audio and video data; the extracting module 706 is configured to receive first voice data of a user in the vehicle and extract target commodity attribute information from the first voice data; and the second determining module 708 is configured to determine the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity.
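The way the four modules hand data to one another can be sketched as a small composition. This is an illustrative sketch only: the class `CommodityInfoDevice`, its field names, the `handle` method and the stub behaviors are assumptions made for the example, and each module is represented by a plain callable rather than by real playback, recognition or matching logic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set

Attributes = Dict[str, Set[str]]  # commodity ID -> attribute keywords


@dataclass
class CommodityInfoDevice:
    play: Callable[[bytes], None]                                    # playing module 702
    determine_attributes: Callable[[bytes], Attributes]              # first determining module 704
    extract_target: Callable[[bytes], Set[str]]                      # extracting module 706
    determine_target: Callable[[Set[str], Attributes], Optional[str]]  # second determining module 708

    def handle(self, av_data: bytes, first_voice_data: bytes) -> Optional[str]:
        """Run the four modules in the order given in the description."""
        self.play(av_data)
        attributes = self.determine_attributes(av_data)
        target = self.extract_target(first_voice_data)
        return self.determine_target(target, attributes)


played = []
device = CommodityInfoDevice(
    play=played.append,
    determine_attributes=lambda av: {"coat": {"caramel", "overcoat"}},
    extract_target=lambda voice: {"caramel"},
    determine_target=lambda target, attrs: next(
        (cid for cid, a in attrs.items() if target <= a), None),
)
print(device.handle(b"stream", b"voice"))  # coat
```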

In one embodiment, determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data includes: acquiring audio data in the audio and video data; determining first key information in the audio data through a voice recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the first key information.

In one embodiment, determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data includes: acquiring video data in the audio and video data; determining second key information in the video data through an image recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the second key information.

In one embodiment, determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data includes: acquiring third key information of each commodity in the live broadcasting room according to the audio data in the audio and video data; acquiring fourth key information of each commodity in the live broadcasting room according to the video data in the audio and video data; filtering the third key information and the fourth key information to obtain target key information; and determining commodity attribute information of each commodity in the live broadcasting room according to the target key information.
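The filtering step is not specified further in the text. One possible reading, offered only as an assumption, is that the key information from audio and from video is normalized, stripped of uninformative entries, and de-duplicated. The function `filter_key_information` and the `stop_words` parameter below are hypothetical.

```python
def filter_key_information(third, fourth, stop_words=frozenset()):
    """Merge audio-derived and video-derived key information.

    Assumed filtering rule: normalize case and whitespace, drop empty
    entries and stop words, and keep the first occurrence of each item.
    """
    seen = set()
    target = []
    for item in [*third, *fourth]:
        key = item.strip().lower()
        if key and key not in stop_words and key not in seen:
            seen.add(key)
            target.append(key)
    return target


third_key_info = ["White", "caramel", "um"]   # from audio data
fourth_key_info = ["white ", "long coat"]     # from video data
print(filter_key_information(third_key_info, fourth_key_info, stop_words={"um"}))
# ['white', 'caramel', 'long coat']
```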

In one embodiment, after determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity, the method further comprises: receiving second voice data for indicating to purchase the target commodity; and performing a purchasing operation of the target commodity according to the second voice data.

For specific limitations on the commodity information processing apparatus, reference may be made to the limitations on the commodity information processing method above, which are not repeated here. Each module in the above commodity information processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the above modules.

In one embodiment, a computer device is provided, which may be a vehicle-mounted terminal, and its internal structure may be as shown in fig. 8. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used for communicating with the server through a network connection. The computer program, when executed by the processor, implements a commodity information processing method. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen; a key, a trackball or a touch pad arranged on the housing of the computer device; or an external keyboard, touch pad, mouse or the like.

It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: receiving audio and video data of a live broadcasting room, and playing the audio and video data at a vehicle-mounted terminal of a vehicle; determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data; receiving first voice data of a user in the vehicle, and extracting target commodity attribute information from the first voice data; and determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity.

In one embodiment, when the processor executes the computer program to determine the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data, the following steps are specifically implemented: acquiring audio data in the audio and video data; determining first key information in the audio data through a voice recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the first key information.

In one embodiment, when the processor executes the computer program to determine the first key information in the audio data through the voice recognition technology, the following steps are specifically implemented: acquiring the volume value of each piece of audio segment data of the audio data; screening out, according to the volume value of each piece of audio segment data, target audio segment data whose volume value meets a first setting condition; converting the target audio segment data into text through the voice recognition technology, and screening out, from the text, target word segments that meet a second setting condition; and determining the first key information according to the target word segments.
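The two screening stages can be sketched as follows. The text leaves both setting conditions open, so this sketch assumes that the volume value is the RMS amplitude of a segment, that the first setting condition is a minimum RMS, and that the second setting condition is membership in a list of attribute words. The speech recognizer is passed in as a callable and is faked in the demonstration; `rms`, `first_key_information` and all sample values are hypothetical.

```python
import math


def rms(samples):
    """Assumed volume value: root-mean-square amplitude of a segment."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def first_key_information(segments, transcribe, min_volume, attribute_words):
    """Screen segments by volume, transcribe them, and keep attribute words."""
    keywords = []
    for samples in segments:
        if rms(samples) < min_volume:      # first setting condition (assumed)
            continue
        for word in transcribe(samples).split():
            # second setting condition (assumed): word is a known attribute
            if word in attribute_words and word not in keywords:
                keywords.append(word)
    return keywords


# Fake recognizer for the demonstration; a real system would call an ASR engine.
transcripts = {
    (0.01, -0.01, 0.02): "red shoes",
    (0.5, -0.6, 0.4): "white caramel overcoat very nice",
}
segments = [list(key) for key in transcripts]

print(first_key_information(
    segments,
    transcribe=lambda samples: transcripts[tuple(samples)],
    min_volume=0.1,
    attribute_words={"white", "caramel", "red"},
))  # ['white', 'caramel']
```

The quiet first segment is discarded before recognition, so "red" never reaches the keyword list even though it is a known attribute word.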

In one embodiment, when the processor executes the computer program to determine the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data, the following steps are specifically implemented: acquiring video data in the audio and video data; determining second key information in the video data through an image recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the second key information.

In one embodiment, when the processor executes the computer program to determine the second key information in the video data through the image recognition technology, the following steps are specifically implemented: acquiring the shape and the size of an object in each frame of image in the video data through the image recognition technology; and determining the second key information in the video data according to the shape and the size of the object in each frame of image.

In one embodiment, when the processor executes the computer program to determine the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data, the following steps are specifically implemented: acquiring third key information of each commodity in the live broadcasting room according to the audio data in the audio and video data; acquiring fourth key information of each commodity in the live broadcasting room according to the video data in the audio and video data; filtering the third key information and the fourth key information to obtain target key information; and determining commodity attribute information of each commodity in the live broadcasting room according to the target key information.

In one embodiment, the processor, when executing the computer program, further performs the following steps: receiving second voice data for indicating to purchase the target commodity; and performing a purchasing operation of the target commodity according to the second voice data.

In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of: receiving audio and video data of a live broadcasting room, and playing the audio and video data at a vehicle-mounted terminal of a vehicle; determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data; receiving first voice data of a user in a vehicle, and extracting target commodity attribute information from the first voice data; and determining the target commodity according to the target commodity attribute information and the commodity attribute information of each commodity.

In one embodiment, when the computer program is executed by the processor to determine the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data, the following steps are specifically implemented: acquiring audio data in the audio and video data; determining first key information in the audio data through a voice recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the first key information.

In one embodiment, when the computer program is executed by the processor to determine the first key information in the audio data through the voice recognition technology, the following steps are specifically implemented: acquiring the volume value of each piece of audio segment data of the audio data; screening out, according to the volume value of each piece of audio segment data, target audio segment data whose volume value meets a first setting condition; converting the target audio segment data into text through the voice recognition technology, and screening out, from the text, target word segments that meet a second setting condition; and determining the first key information according to the target word segments.

In one embodiment, when the computer program is executed by the processor to determine the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data, the following steps are specifically implemented: acquiring video data in the audio and video data; determining second key information in the video data through an image recognition technology; and determining commodity attribute information of each commodity in the live broadcasting room according to the second key information.

In one embodiment, when the computer program is executed by the processor to determine the second key information in the video data through the image recognition technology, the following steps are specifically implemented: acquiring the shape and the size of an object in each frame of image in the video data through the image recognition technology; and determining the second key information in the video data according to the shape and the size of the object in each frame of image.

In one embodiment, when the computer program is executed by the processor to determine the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data, the following steps are specifically implemented: acquiring third key information of each commodity in the live broadcasting room according to the audio data in the audio and video data; acquiring fourth key information of each commodity in the live broadcasting room according to the video data in the audio and video data; filtering the third key information and the fourth key information to obtain target key information; and determining commodity attribute information of each commodity in the live broadcasting room according to the target key information.

In one embodiment, the computer program, when executed by the processor, further performs the following steps: receiving second voice data for indicating to purchase the target commodity; and performing a purchasing operation of the target commodity according to the second voice data.

Those skilled in the art will appreciate that all or part of the processes of the above methods may be implemented by a computer program instructing the relevant hardware; the computer program may be stored on a non-transitory computer readable storage medium and, when executed, may include the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), and Direct Rambus Dynamic RAM (DRDRAM), among others.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this description.

The above examples represent only a few embodiments of the present application. They are described in relative detail, but are not thereby to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and these would fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. A commodity information processing method, the method comprising:

receiving audio and video data of a live broadcasting room, and playing the audio and video data at a vehicle-mounted terminal of a vehicle;

determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data; the determining the commodity attribute information of each commodity in the live broadcasting room according to the audio and video data comprises the following steps: acquiring audio data in the audio-video data; determining first key information in the audio data through a voice recognition technology; the determining the first key information in the audio data by the voice recognition technology comprises: acquiring a target audio segment in audio data, and acquiring a keyword representing commodity attributes from the target audio segment by adopting a voice recognition technology, wherein the target audio segment is an audio segment for introducing and explaining commodities by a host in a live broadcasting room; determining commodity attribute information of each commodity in the live broadcasting room according to the first key information;

receiving first voice data of a user in the vehicle, and extracting target commodity attribute information from the first voice data;

matching the target commodity attribute information with the commodity attribute information of each commodity to determine a target commodity corresponding to the target commodity attribute information.

2. The method of claim 1, wherein the determining the first key information in the audio data by a speech recognition technique comprises:

acquiring the volume value of each audio segment data of the audio data;

screening target audio segment data with the volume value meeting a first setting condition according to the volume value of each audio segment data;

converting the target audio segment data into text through the voice recognition technology, and screening out, from the text, target word segments that meet a second setting condition;

and determining the first key information according to the target word segments.

3. The method of claim 1, wherein determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data comprises:

acquiring video data in the audio and video data;

determining second key information in the video data through an image recognition technology;

and determining commodity attribute information of each commodity in the live broadcasting room according to the second key information.

4. The method of claim 3, wherein the determining second key information in the video data through an image recognition technology comprises:

acquiring the shape and the size of an object in each frame of image in the video data through the image recognition technology;

and determining second key information in the video data according to the shape and the size of the object in each frame of image.

5. The method of claim 1, wherein determining commodity attribute information of each commodity in the live broadcasting room according to the audio and video data comprises:

acquiring third key information of each commodity in the live broadcasting room according to the audio data in the audio-video data;

acquiring fourth key information of each commodity in the live broadcasting room according to video data in the audio and video data;

filtering the third key information and the fourth key information to obtain target key information;

and determining commodity attribute information of each commodity in the live broadcasting room according to the target key information.

6. The method of claim 1, wherein after the step of matching the target commodity attribute information with the commodity attribute information of each commodity to determine a target commodity corresponding to the target commodity attribute information, the method further comprises:

receiving second voice data of the user for indicating to purchase the target commodity;

and executing the purchasing operation of the target commodity according to the second voice data.

7. A commodity information processing apparatus, the apparatus comprising:

the playing module is used for receiving the audio and video data of the live broadcasting room and playing the audio and video data at the vehicle-mounted terminal of the vehicle;

the first determining module is configured to determine, according to the audio and video data, commodity attribute information of each commodity in the live broadcasting room, where determining, according to the audio and video data, commodity attribute information of each commodity in the live broadcasting room includes: acquiring audio data in the audio-video data; determining first key information in the audio data through a voice recognition technology; the determining the first key information in the audio data by the voice recognition technology comprises: acquiring a target audio segment in audio data, and acquiring a keyword representing commodity attributes from the target audio segment by adopting a voice recognition technology, wherein the target audio segment is an audio segment for introducing and explaining commodities by a host in a live broadcasting room; determining commodity attribute information of each commodity in the live broadcasting room according to the first key information;

the extraction module is used for receiving first voice data of a user in the vehicle and extracting target commodity attribute information from the first voice data;

and the second determining module is used for matching the target commodity attribute information with the commodity attribute information of each commodity to determine a target commodity corresponding to the target commodity attribute information.

8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 6 when executing the computer program.

9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 6.