
Lesson 5: Operationalize Machine Learning on AWS

Watch Lesson 5: Operationalize Machine Learning on AWS Video

Pragmatic AI Labs


This notebook was produced by Pragmatic AI Labs.

Load AWS API Keys

Put keys in local or remote GDrive:

cp ~/.aws/credentials /Users/myname/Google\ Drive/awsml/

Mount GDrive

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
Mounted at /content/gdrive

import os
os.listdir("/content/gdrive/My Drive/awsml")
['kaggle.json', 'credentials', 'config']

Install Boto

!pip -q install boto3

Create API Config

!mkdir -p ~/.aws &&\
  cp /content/gdrive/My\ Drive/awsml/credentials ~/.aws/credentials 

Test Comprehend API Call

import boto3
comprehend = boto3.client(service_name='comprehend', region_name="us-east-1")
text = "There is smoke in San Francisco"
comprehend.detect_sentiment(Text=text, LanguageCode='en')
{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
   'content-length': '160',
   'content-type': 'application/x-amz-json-1.1',
   'date': 'Thu, 22 Nov 2018 00:21:54 GMT',
   'x-amzn-requestid': '9d69a0a9-edec-11e8-8560-532dc7aa62ea'},
  'HTTPStatusCode': 200,
  'RequestId': '9d69a0a9-edec-11e8-8560-532dc7aa62ea',
  'RetryAttempts': 0},
 'Sentiment': 'NEUTRAL',
 'SentimentScore': {'Mixed': 0.008628507144749165,
  'Negative': 0.1037612184882164,
  'Neutral': 0.8582549691200256,
  'Positive': 0.0293553676456213}}

5.1 Understand ML Operations

Key Concepts

  • Monitoring
  • Security
  • Retraining Models
  • A/B Testing
  • TCO (Total Cost of Ownership)

MLOps

  • Are you using a simple enough model?
  • Are you reading from the data lake, or wired directly into the production SQL database?
  • Do you have alerts set up for prediction threshold failures?
  • Do you have separate environments (Dev, Stage, Prod)?
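
One way to read "alerts for prediction threshold failures" is to track how often recent predictions fall below a confidence floor. A minimal sketch under that assumption; the function names, the 0.6 floor, the 20% limit, and the `recent` values are all hypothetical and not taken from the lesson:

```python
def low_confidence_rate(confidences, min_confidence=0.6):
    """Fraction of predictions whose confidence is below min_confidence."""
    if not confidences:
        return 0.0
    low = sum(1 for c in confidences if c < min_confidence)
    return low / len(confidences)


def should_alert(confidences, min_confidence=0.6, max_low_rate=0.2):
    """True when too many recent predictions are low-confidence."""
    return low_confidence_rate(confidences, min_confidence) > max_low_rate


# Made-up confidences for the ten most recent predictions.
recent = [0.91, 0.88, 0.42, 0.95, 0.51, 0.39, 0.97, 0.85, 0.90, 0.93]
print(low_confidence_rate(recent))  # 3 of 10 are below 0.6
print(should_alert(recent))
```

In practice the boolean would feed whatever alerting channel the team already uses; the check itself stays independent of that choice.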

5.2 Use Containerization with Machine Learning and Deep Learning

Key Concepts

[Figure: Docker workflows]

Amazon ECS (Elastic Container Service)

[Demo] ECS
  • Create a repo
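
The "create a repo" step can be sketched with boto3, assuming the repo in question is an Amazon ECR image repository that ECS later pulls from. The account ID and repository name below are placeholders, and `image_uri` and `create_repo` are hypothetical helper names:

```python
def image_uri(account_id, region, repo_name, tag="latest"):
    """Build the registry address that `docker push` and ECS task definitions use."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo_name}:{tag}"


def create_repo(repo_name, region="us-east-1"):
    """Create an ECR repository and return its URI (needs AWS credentials)."""
    import boto3  # imported lazily so image_uri works without boto3 installed

    ecr = boto3.client("ecr", region_name=region)
    response = ecr.create_repository(repositoryName=repo_name)
    return response["repository"]["repositoryUri"]


# Placeholder account ID and repository name.
print(image_uri("123456789012", "us-east-1", "awsml-demo"))
```

`create_repo` is not called here because it makes a live API call and fails if the repository already exists.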

Amazon EKS (Kubernetes on AWS)

5.3 Implement continuous deployment and delivery for Machine Learning

Key Concepts

[Figure: CodeBuild]

[Demo] CodeBuild

  • buildspec.yml
  • console
  • build job
  • sync to S3
  • ECS integration
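
To make the buildspec.yml and "sync to S3" items concrete, the sketch below renders a small buildspec from Python. The phase commands (installing from `requirements.txt`, running pytest) and the bucket name are assumptions for illustration, not the contents of the demo's actual file:

```python
def render_buildspec(bucket, source_dir="."):
    """Return the text of a minimal CodeBuild buildspec.yml."""
    lines = [
        "version: 0.2",
        "phases:",
        "  install:",
        "    commands:",
        "      - pip install -r requirements.txt",
        "  build:",
        "    commands:",
        "      - python -m pytest",
        "  post_build:",
        "    commands:",
        # Copy the build output to S3 once the build phase has finished.
        f"      - aws s3 sync {source_dir} s3://{bucket}/",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical bucket name.
spec = render_buildspec("my-hypothetical-bucket")
print(spec)
```

Writing `spec` to a file named `buildspec.yml` at the repository root is where CodeBuild looks for it by default.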

5.4 A/B Testing production deployments

Key Concepts

  • SageMaker A/B testing capabilities
  • Deciding on the ratio of traffic delivered to each ML model
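
SageMaker splits endpoint traffic across production variants in proportion to each variant's weight. A minimal sketch of a two-variant configuration, assuming two already-created models; the model names, variant names, instance type, and 10% share are placeholders:

```python
def production_variants(model_a, model_b, share_b=0.1, instance_type="ml.m5.large"):
    """Build a ProductionVariants list that sends share_b of traffic to model_b."""
    if not 0.0 <= share_b <= 1.0:
        raise ValueError("share_b must be between 0 and 1")

    def variant(name, model, weight):
        return {
            "VariantName": name,
            "ModelName": model,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            "InitialVariantWeight": weight,
        }

    return [
        variant("VariantA", model_a, 1.0 - share_b),
        variant("VariantB", model_b, share_b),
    ]


def traffic_shares(variants):
    """Each variant's share of requests: its weight divided by the total weight."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


# Hypothetical model names.
variants = production_variants("model-current", "model-candidate", share_b=0.1)
print(traffic_shares(variants))
```

The list can then be passed as `ProductionVariants` to the SageMaker client's `create_endpoint_config` call. Starting the candidate at a small share limits exposure while its metrics are compared against the current model.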

[Demo] SageMaker A/B

5.5 Troubleshoot Production Deployment

Key Concepts

  • Using CloudWatch
  • Searching CloudWatch Logs
  • Alerting on key events
  • Using Auto Scaling capabilities
  • Enterprise AWS Support
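
Searching logs and alerting on key events can both be sketched with boto3. The example below assumes a SageMaker endpoint as the thing being monitored; the endpoint name, alarm name, period, and threshold are placeholders, and `error_alarm_params` and `search_logs` are hypothetical helper names:

```python
def error_alarm_params(endpoint_name, variant_name="AllTraffic",
                       threshold=1.0, sns_topic_arn=None):
    """Keyword arguments for the CloudWatch client's put_metric_alarm call."""
    params = {
        "AlarmName": f"{endpoint_name}-5xx-errors",
        "Namespace": "AWS/SageMaker",
        "MetricName": "Invocation5XXErrors",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Sum",
        "Period": 300,  # seconds per evaluation window
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }
    if sns_topic_arn:
        # Optional: notify an SNS topic when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params


def search_logs(log_group, pattern, region="us-east-1"):
    """Return log messages matching a filter pattern (needs AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    logs = boto3.client("logs", region_name=region)
    response = logs.filter_log_events(logGroupName=log_group, filterPattern=pattern)
    return [event["message"] for event in response["events"]]


# Hypothetical endpoint name.
params = error_alarm_params("my-endpoint")
print(params["AlarmName"])
```

With credentials in place, `boto3.client("cloudwatch").put_metric_alarm(**params)` would create the alarm, and `search_logs` could be pointed at the endpoint's log group with a pattern such as `"ERROR"`.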

[Demo] CloudWatch Features

5.6 Production Security

Key Concepts

  • Understanding the KMS system (encryption)
  • IAM Roles for SageMaker
  • IAM Roles for VPC
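
An IAM role for SageMaker starts with a trust policy that lets the SageMaker service assume the role. A minimal sketch of that policy and the role creation; the function names are hypothetical, and the permissions policies a real role also needs (S3 access, KMS key use, and so on) are left out:

```python
import json


def sagemaker_trust_policy():
    """Trust policy allowing the SageMaker service to assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_sagemaker_role(role_name):
    """Create the role and return its ARN (needs AWS credentials)."""
    import boto3  # imported lazily so the policy helper works without boto3

    iam = boto3.client("iam")
    response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(sagemaker_trust_policy()),
    )
    return response["Role"]["Arn"]


print(json.dumps(sagemaker_trust_policy(), indent=2))
```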

[Demo] SageMaker Security Features

5.7 Cost and Efficiency of ML Systems

Key Concepts

  • Understanding Spot Instances (show spot code)
  • Understanding proper use of CPU vs. GPU resources
  • Scale up and scale down
  • Improve time to market
  • Choosing wisely between an AI API and “Do It Yourself”
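
As a stand-in for the spot code mentioned above, the sketch below picks the cheapest Availability Zone from a spot price history and estimates the saving over on-demand. The sample records and both prices are made up for illustration, and the helper names are hypothetical:

```python
def cheapest_spot(history):
    """Return the record with the lowest SpotPrice (the API reports prices as strings)."""
    return min(history, key=lambda record: float(record["SpotPrice"]))


def spot_savings(on_demand_price, spot_price):
    """Fraction saved by paying spot_price instead of on_demand_price."""
    return 1.0 - spot_price / on_demand_price


def fetch_spot_history(instance_type, region="us-east-1"):
    """Fetch recent spot prices for one instance type (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_spot_price_history(
        InstanceTypes=[instance_type],
        ProductDescriptions=["Linux/UNIX"],
        MaxResults=20,
    )
    return response["SpotPriceHistory"]


# Made-up records in the shape of the SpotPriceHistory entries.
sample_history = [
    {"AvailabilityZone": "us-east-1a", "InstanceType": "p2.xlarge", "SpotPrice": "0.3100"},
    {"AvailabilityZone": "us-east-1b", "InstanceType": "p2.xlarge", "SpotPrice": "0.2700"},
]

best = cheapest_spot(sample_history)
saving = spot_savings(0.90, float(best["SpotPrice"]))  # 0.90 is an assumed on-demand price
print(best["AvailabilityZone"], f"{saving:.0%}")
```

Spot capacity can be reclaimed at short notice, so this kind of saving fits training jobs that checkpoint and resume better than latency-sensitive serving.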

[Demo] Spot Instances on AWS