Today AWS announced new ways for you to easily add machine learning (ML) predictions to applications and business intelligence (BI) dashboards using relational data in your Amazon Aurora database and unstructured data in Amazon S3, by simply adding a few statements to your SQL (structured query language) queries and making a few clicks in Amazon QuickSight. Aurora, Amazon Athena, and Amazon QuickSight make direct calls to AWS ML services like Amazon SageMaker and Amazon Comprehend so you don’t need to call them from your application. This makes it more straightforward to add ML predictions to your applications without the need to build custom integrations, move data around, learn separate tools, write complex lines of code, or even have ML experience.
These new changes help make ML more usable and accessible to database developers and business analysts by making sophisticated ML predictions more readily available through SQL queries and dashboards. Formerly, you could spend days writing custom application-level code that must scale and be managed and supported in production. Now anyone who can write SQL can make and use predictions in their applications without any custom “glue code.”
Making sense of a world awash in data
AWS firmly believes that in the not-too-distant future, virtually every application will be infused with ML and artificial intelligence (AI). Tens of thousands of customers benefit from ML through Amazon SageMaker, a fully managed service that lets data scientists and developers quickly and easily build, train, and deploy ML models at scale.
While there are a variety of ways to build models and to add intelligence to applications through easy-to-use APIs such as Amazon Comprehend, it can still be challenging to incorporate these models into your databases, analytics, and business intelligence reports. Consider a relatively simple customer service example. Amazon Comprehend can quickly evaluate the sentiment of a piece of text (is it positive or negative?). Suppose that I leave feedback on a store’s customer service page: “Your product stinks and I’ll never buy from you again!” It should be trivial for the store to run sentiment analysis on user feedback and contact me immediately to make things right. The data is available in their database and ML services are widely available.
The problem, however, lies in the difficulty of building prediction pipelines to move data between models and applications.
Developers have historically had to perform a large amount of complicated manual work to take these predictions and make them part of a broader application, process, or analytics dashboard. This can include undifferentiated, tedious application-level code development to copy data between different data stores and locations and transform data between formats, before submitting data to the ML models and transforming the results to use inside your application. Such work tends to be cumbersome and a poor way to use the valuable time of your developers. Moreover, moving data in and out of data stores complicates security and governance.
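To make that glue work concrete, here is a minimal sketch of the kind of application-level code the customer feedback example has traditionally required. It assumes feedback rows have already been copied out of the database as (id, text) pairs, and it calls the Amazon Comprehend detect_sentiment operation through a client object passed in by the caller. The function name flag_negative_feedback and the FakeComprehend stand-in are hypothetical, added only so the sketch runs without AWS credentials; in practice the client would come from boto3.client("comprehend").

```python
from typing import Any, Iterable, List, Tuple


def flag_negative_feedback(
    comprehend: Any, rows: Iterable[Tuple[int, str]]
) -> List[int]:
    """Return the ids of feedback rows whose sentiment is NEGATIVE."""
    flagged = []
    for feedback_id, text in rows:
        # One service call per row: this is the data movement that the
        # application has to own, scale, and support in production.
        response = comprehend.detect_sentiment(Text=text, LanguageCode="en")
        if response["Sentiment"] == "NEGATIVE":
            flagged.append(feedback_id)
    return flagged


class FakeComprehend:
    """Hypothetical offline stand-in for the real Comprehend client."""

    def detect_sentiment(self, Text: str, LanguageCode: str) -> dict:
        # Crude keyword check, only so the example runs locally.
        negative = "stinks" in Text.lower() or "never buy" in Text.lower()
        return {"Sentiment": "NEGATIVE" if negative else "POSITIVE"}


if __name__ == "__main__":
    feedback = [
        (1, "Your product stinks and I'll never buy from you again!"),
        (2, "Fast shipping, thank you."),
    ]
    print(flag_negative_feedback(FakeComprehend(), feedback))
```

Even this small loop leaves out the parts that usually take the most effort: reading from the data store, batching, retries, and writing the results back where the application or dashboard can use them.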
Putting machine learning in the hands of every developer
At AWS, our mission is clear: we aim to put machine learning in the hands of every developer. We do this by making it easier for you to become productive with sophisticated ML services. Customers of all sizes, including NFL, Intuit, AstraZeneca, and Celgene, rely on AWS ML services such as Amazon SageMaker and Amazon Comprehend. Celgene, for example, uses AWS ML services for toxicology prediction to virtually analyze the biological impacts of potential drugs without putting patients at risk. A model that previously took two months to train can now be trained in four hours.
While AWS offers the broadest and deepest set of AI and ML services, and though we introduced more than 200 machine learning features and capabilities in 2018 alone, we’ve felt that more is needed. Beyond these innovations, one of the best things we can do is to enable your existing talent to become productive with ML.
And, specifically, developer talent and business analyst talent.
Though we offer services that improve the productivity of data scientists, we want to give the much broader population of application developers access to fully cloud-native, sophisticated ML services. Tens of thousands of customers use Aurora and are adept at programming with SQL. We believe it is crucial to enable you to run ML predictions on this data so you can access innovative data science without slowing down transaction processing. As before, you can train ML models against your business data using Amazon SageMaker, with the added ability to run predictions against those same models with one line of SQL using Aurora or Athena. This makes the results of ML models more accessible for a broad population of application developers.
Lead scoring is a good example of how this works. If you build a CRM system on top of Aurora, you’ll store all of your customer relationship data, marketing outreach, leads, etc. in the database. As leads come in from the website they’re moved to Aurora, and your sales team follows up on the leads to convert them to customers.
But what if you wanted to help make that process more effective for your sales team? Lead scoring is a predictive model that qualifies and ranks incoming leads so that the sales team can prioritize the leads most likely to convert to a customer sale, making the team more productive. You can take a lead scoring model built by your data science team, or one that you have purchased on the AWS ML Marketplace, deploy it to Amazon SageMaker, and then order all your sales queues by priority based on the prediction from the model. Unlike in the past, you needn’t write any glue code.
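As a rough local illustration of ranking a sales queue by a model’s prediction from inside a SQL query, the sketch below uses Python’s standard sqlite3 module as a stand-in for Aurora and registers a toy function, score_lead, in place of a model deployed to Amazon SageMaker. The table layout, column names, and scoring weights are all assumptions made for this example, and the Aurora integration has its own syntax that is not shown here.

```python
import sqlite3
from typing import List


def score_lead(page_views: int, requested_demo: int) -> float:
    """Toy stand-in for a lead scoring model (hypothetical weights)."""
    return min(1.0, 0.05 * page_views + 0.5 * requested_demo)


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory leads table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE leads (name TEXT, page_views INTEGER, requested_demo INTEGER)"
    )
    conn.executemany(
        "INSERT INTO leads VALUES (?, ?, ?)",
        [("Acme", 3, 0), ("Globex", 8, 1), ("Initech", 12, 0)],
    )
    return conn


def build_sales_queue(conn: sqlite3.Connection) -> List[str]:
    """Return lead names ordered from highest to lowest predicted score."""
    # Expose the scoring function to SQL so the query can call it directly.
    conn.create_function("score_lead", 2, score_lead)
    cursor = conn.execute(
        "SELECT name, score_lead(page_views, requested_demo) AS score "
        "FROM leads ORDER BY score DESC"
    )
    return [row[0] for row in cursor.fetchall()]


if __name__ == "__main__":
    print(build_sales_queue(make_demo_db()))
```

The point of the sketch is the shape of the query: the prediction appears as one more column expression, so ordering the queue is an ordinary ORDER BY rather than a separate pipeline.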
Or you might want to use these services together to deliver on a next-best-offer use case. For example, your customer might phone into your call center complaining about an issue. The customer service representative successfully addresses the issue and proceeds to offer new products or services. How? Well, the representative can pull up a product recommendation on the Amazon QuickSight dashboard that shows multiple views and suggestions.
The first view shows product recommendations based on an Aurora query. The query pulls the customer profile, shopping history, and product catalog, and calls a model in Amazon SageMaker to make product recommendations. The second view is an Athena query that pulls customer browsing history or clickstream data from an S3 data lake, and calls an Amazon SageMaker model to make product recommendations. The third view is an Amazon QuickSight query that takes the results from the first and second views, calls an ensemble model in Amazon SageMaker, and makes the recommendations. You now have several offers to make based on different views of the customer, all within one dashboard.
On the BI analyst side, we regularly hear from customers that it’s frustrating to have to build and manage prediction pipelines before getting predictions from a model. Developers currently spend days writing application-level code to move data back and forth between models and applications. You may now choose to deprecate your prediction pipelines and instead use Amazon QuickSight to visualize and report on all your ML predictions.
For application developers and business analysts, these changes make it more straightforward to add ML predictions to your applications without the need to build custom integrations, move data around, learn separate tools, write complex lines of code, or even have ML experience. Instead of days of developer labor, you can now add a few statements to your SQL queries and make a few clicks in Amazon QuickSight.
In these ways, we’re enabling a broader population of developers and data analysts to tap into the power of ML, with no Ph.D. required.
About the Author
Matt Asay (pronounced “Ay-see”) is a principal at AWS, and has spent nearly two decades working for a variety of open source and big data companies. You can follow him on Twitter (@mjasay).