
Amazon Comprehend now supports multi-label custom classification



Amazon Comprehend is a fully managed natural language processing (NLP) service that enables text analytics to extract insights from the content of documents. Amazon Comprehend supports custom classification and enables you to build custom classifiers that are specific to your requirements, without the need for any ML expertise. Previously, custom classification supported only multi-class classification, which assigns a single label to each document from a list of mutually exclusive labels. Starting January 6, custom classification also supports multi-label classification, which lets you train models that classify a document with more than one label.

For example, you can use multi-label classification to categorize customer contact transcripts with one or more labels to identify departments within your company like Payments, Renewals or Tech Support. These labels can then be mapped to relevant content in your support library or directed towards the appropriate contacts within your company.

In this post, let’s take a look at how to predict the subjects of an academic paper based on its abstract (data source: Yang et al. 2018. Sequence Generation Model for Multi-Label Classification). Custom classification is a two-step process. First, you train a custom classifier to recognize the labels that are of interest to you. In the image below we’ve created a CSV file with abstracts and the applicable labels on each row:

[Image: CSV file with an abstract and its applicable labels on each row]

You can download a subsample of the dataset above in Comprehend supported input format at comprehend_multilabel.zip.
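The layout of such a training file can be sketched in Python. This is a minimal sketch, assuming the multi-label CSV layout described in the Comprehend developer guide: labels in the first column joined by a delimiter (the pipe character by default), the document text in the second column, and no header row. The label names and abstracts below are hypothetical placeholders, not rows from the dataset.

```python
import csv
import io

LABEL_DELIMITER = "|"  # assumed default delimiter for multi-label training data


def to_training_csv(rows):
    """Serialize (labels, text) pairs as two-column CSV without a header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for labels, text in rows:
        # One document per row, so collapse line breaks and repeated spaces.
        writer.writerow([LABEL_DELIMITER.join(labels), " ".join(text.split())])
    return buffer.getvalue()


# Hypothetical examples; the label names are illustrative only.
sample_rows = [
    (["cs.LG", "stat.ML"],
     "We propose a sequence generation model for\nmulti-label classification."),
    (["math.CO"], "We study colorings of planar graphs, with applications."),
]

training_csv = to_training_csv(sample_rows)
print(training_csv)
```

The `csv` module quotes any abstract that contains a comma, so the two-column structure survives free-form text.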

Next we train the classifier in the Amazon Comprehend console. We choose multi-label mode, point to the S3 location where the training data is stored and manage other settings. See detailed instructions in the developer guide:

[Image: training a classifier in multi-label mode in the Amazon Comprehend console]
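The same training step can also be started programmatically. The following is a rough sketch using the `create_document_classifier` call in boto3, not the exact settings used in this post: the classifier name, bucket, and IAM role ARN are hypothetical placeholders, and the language code and label delimiter are assumptions.

```python
def build_classifier_request(name, training_s3_uri, role_arn):
    """Assemble keyword arguments for Comprehend's CreateDocumentClassifier."""
    return {
        "DocumentClassifierName": name,
        "Mode": "MULTI_LABEL",  # the alternative is "MULTI_CLASS"
        "LanguageCode": "en",  # assumption: English training documents
        "DataAccessRoleArn": role_arn,
        "InputDataConfig": {
            "S3Uri": training_s3_uri,
            "LabelDelimiter": "|",  # assumption: pipe-separated labels
        },
    }


def train_classifier(request):
    """Start training; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the request builder works offline

    client = boto3.client("comprehend")
    response = client.create_document_classifier(**request)
    return response["DocumentClassifierArn"]


# Hypothetical placeholders: substitute your own bucket and IAM role.
request = build_classifier_request(
    name="paper-subjects",
    training_s3_uri="s3://example-bucket/train/multilabel.csv",
    role_arn="arn:aws:iam::123456789012:role/ExampleComprehendRole",
)
print(request["Mode"])
```

Calling `train_classifier(request)` would submit the job and return the classifier ARN, which the classification step needs.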

In the second step of custom classification, after Amazon Comprehend trains the classifier, you send unlabeled documents to be classified using the console or the StartDocumentClassificationJob API. For our example, we will run inference on a file that has one document per line:

[Images: the inference input file, with one document per line]
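For readers who prefer the API over the console, here is a hedged sketch of the StartDocumentClassificationJob call through boto3. The job name, S3 locations, and ARNs are hypothetical placeholders; `ONE_DOC_PER_LINE` matches the one-document-per-line input described above.

```python
def build_job_request(classifier_arn, input_s3_uri, output_s3_uri, role_arn):
    """Assemble keyword arguments for StartDocumentClassificationJob."""
    return {
        "JobName": "paper-subjects-inference",  # hypothetical job name
        "DocumentClassifierArn": classifier_arn,
        "DataAccessRoleArn": role_arn,
        "InputDataConfig": {
            "S3Uri": input_s3_uri,
            "InputFormat": "ONE_DOC_PER_LINE",  # each line is one document
        },
        "OutputDataConfig": {"S3Uri": output_s3_uri},
    }


def start_job(request):
    """Submit the job; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the request builder works offline

    client = boto3.client("comprehend")
    response = client.start_document_classification_job(**request)
    return response["JobId"]


# Hypothetical placeholders: substitute your own ARNs and bucket.
job_request = build_job_request(
    classifier_arn=(
        "arn:aws:comprehend:us-east-1:123456789012:"
        "document-classifier/paper-subjects"
    ),
    input_s3_uri="s3://example-bucket/inference/abstracts.txt",
    output_s3_uri="s3://example-bucket/output/",
    role_arn="arn:aws:iam::123456789012:role/ExampleComprehendRole",
)
print(job_request["InputDataConfig"]["InputFormat"])
```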

Depending on whether you trained a multi-class or multi-label custom classifier, the classification API examines each document and returns either the specific label that best represents the content (multi-class) or the set of labels that best represent it (multi-label). For our analysis job, we get an output as shown below:

[Image: output of the analysis job]

Here’s a detailed look at one line:

[Image: one line of the output, showing the labels and their scores]

The output identifies all the subjects that apply to each abstract and their associated scores.
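To act on this output, you would typically keep only the labels whose score clears a threshold. The sketch below assumes the job output is JSON Lines in which each record carries `File`, `Line`, and a `Labels` list of `Name`/`Score` pairs. The sample record and the 0.5 threshold are hypothetical illustrations, not values from the job shown in this post.

```python
import json


def labels_above(jsonl_line, threshold=0.5):
    """Return (name, score) pairs from one output record at or above threshold."""
    record = json.loads(jsonl_line)
    return [
        (label["Name"], label["Score"])
        for label in record.get("Labels", [])
        if label["Score"] >= threshold
    ]


# Hypothetical record in the assumed output layout.
sample_line = json.dumps({
    "File": "abstracts.txt",
    "Line": "0",
    "Labels": [
        {"Name": "cs.LG", "Score": 0.93},
        {"Name": "stat.ML", "Score": 0.71},
        {"Name": "math.CO", "Score": 0.04},
    ],
})

print(labels_above(sample_line))
```

The right threshold depends on how you weigh missed labels against spurious ones, so treat 0.5 as a starting point only.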

You can also create an endpoint with your custom multi-label classifier to enable real-time applications. Learn more about creating an endpoint for synchronous inference here.
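As a rough illustration of real-time use, the sketch below wraps the `classify_document` call that a Comprehend client exposes for endpoints. It assumes a multi-label endpoint returns its results under a `Labels` key. The stub client, endpoint ARN, and returned score are hypothetical and exist only so the example runs without AWS access; in practice you would pass `boto3.client("comprehend")`.

```python
def classify_text(client, endpoint_arn, text):
    """Classify one document synchronously and return a {label: score} dict."""
    response = client.classify_document(Text=text, EndpointArn=endpoint_arn)
    # Assumption: multi-label endpoints report their results under "Labels".
    return {item["Name"]: item["Score"] for item in response.get("Labels", [])}


class StubClient:
    """Offline stand-in for boto3.client("comprehend"); returns canned data."""

    def classify_document(self, Text, EndpointArn):
        return {"Labels": [{"Name": "cs.LG", "Score": 0.9}]}


# Hypothetical endpoint ARN, used here only as a placeholder string.
scores = classify_text(
    StubClient(),
    "arn:aws:comprehend:us-east-1:123456789012:"
    "document-classifier-endpoint/example",
    "We propose a sequence generation model for multi-label classification.",
)
print(scores)
```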

Amazon Comprehend multi-label classification is now available in all AWS regions where Amazon Comprehend is available. To try the new feature, log in to the Amazon Comprehend console for a code-free experience, or download the AWS SDK. You can also learn more about this new feature in the documentation.

The dataset used in this post is a redacted, subsampled, and reformatted version of the AAPD dataset made available as part of Yang et al. 2018. Sequence Generation Model for Multi-Label Classification which is licensed under CC BY 4.0. A copy of the license is available here.



About the Author

Sameer Karnik is a Sr. Product Manager leading product for Amazon Comprehend, AWS’s natural language processing service.
