Introduction to Machine Learning on AWS
Artificial intelligence is the larger field of study using computers to do processes generally done by the human brain.
Machine learning is a subset of artificial inteilligence where we work on algorithms that are able to learn from a set of training data.
Deep learning is a subset of machine learning that uses multilayer networks to build a model closer to a human brain.
All AI systems have three parts: 1) Training: supply training data to an algorithm 1) Model: Receive an output of the training process 1) Inference and hosting: use the model to make a prediction
All AWS services are exposed to each other via APIs that can be communicated via SDKs.
Amazon has a number of different ML services: 1) Amazon Rekognition: computer vision 1) Amazon Textract: automated text extraction and analysis 1) Amazon Comprehend: natural language processing 1) Amazon Transcribe: speech to text 1) Amazon Translate: machine langauge translation 1) Amazon Lex: chat bots and virtual agents 1) Amazon Sagemaker: build, train, and deploy your own machine learning models on managed infrastracture.
Amazon Rekegnition
- Fully managed deep-learning image-recognition service.
- Designed for scalability.
- Capabilities: Recognizes scenes, objects, concepts, faces.
Key Features:
- Uses pretrained models.
- APIs provide inferences from these models.
- Detects labels, faces, texts, and specific content like moderation labels and protective equipment in images.
API Examples:
- CompareFaces: Matches faces between a source and target image.
- DetectFaces: Identifies faces and provides details like age range, presence of smile, etc.
- DetectLabels: Identifies objects, events, concepts, with a confidence score.
- DetectModerationLabels: Finds potentially sensitive content.
- DetectProtectiveEquipment: Identifies face, hand, and head covers.
- DetectText: Recognizes text in an image.
- Celebrity Recognition: Identifies celebrities in images.
Video Content Analysis:
- Similar operations on stored video content.
- Video-specific APIs for segment detection, person tracking, etc.
Collections of Faces APIs:
- CreateCollection: Creates a collection of faces.
- IndexFaces: Populates a collection with detected faces.
- Search by Face ID or Image: Finds matching faces in a collection.
Custom Training and Custom Labels:
- Ability to train Rekognition with user’s labeled datasets.
- CreateDataset and DetectCustomLabels: User can create and use custom models for specific recognition tasks.
Applications:
- Image search indexing (e.g., vacation rental photos with beach scenes).
- Automating detection of missing parts in factory assembly lines.
- Brand logo recognition in online videos.
Amazon Textract
- Fully managed service for analyzing documents.
- Extracts text and structure, including handwritten documents.
- Built on AWS-trained models.
Available APIs:
- AnalyzeDocument: Analyzes specific features like tables, forms, queries.
- AnalyzeExpense: Processes expenses from receipts or invoices.
- DetectDocumentText: Identifies text within a document.
- AnalyzeID: Extracts information from identity documents like driver’s licenses.
- Asynchronous operation: Start API (begins process, returns job ID), Get API (retrieves status/results).
API Responses:
- AnalyzeDocument: Returns JSON data structure with blocks (page, line, word, key-value sets).
- AnalyzeExpense: Identifies vendor, sales information from expenses.
- DetectDocumentText: Provides blocks of text, lines, and words.
- AnalyzeID: Extracts details from identity documents.
Practical Use Case:
- Handwritten movie reviews: determine which API to use for extraction.
- Focus on extracting text and structured data from documents.
Amazon Comprehend
- Natural-language-processing machine learning service.
- Extracts insights and connections from text.
- Offers both pre-trained Amazon models and custom model training.
APIs for Language and Entity Detection:
- Detect-Dominant-Language: Real-time and batch processing for language detection.
- Batch-Detect-Dominant-Language: For batch analysis of up to 25 documents.
- Start-Dominant-Language-Detection-Job: Asynchronous batch processing with S3 input/output.
- Detect Named Entities: Identifies entities like persons, organizations in text.
- Detect Key Noun Phrases: Finds important noun phrases in text.
- Detect Personally Identifying Information: Identifies and redacts sensitive information.
Sentiment Analysis API:
- Detect-Sentiment: Determines text sentiment (e.g., positive, negative).
Custom Model APIs:
- Custom Classification: Classifies documents using custom labels.
- Custom Entity Recognizer: Recognizes custom-defined entities in text.
- Start-Document-Classification-Job and Start-Entities-Detection-Job: Asynchronous processing with custom models.
- Classify-Document and Detect-Entities: Real-time processing with purchased endpoints for custom models.
Additional APIs and Features:
- Event detection and topic detection.
-
Detailed AWS Documentation available for more information.
- Practical Use Case Scenario:
- Analyzing movie preview forms for audience sentiment.
- Integrating sentiment analysis with demographic analytics.
Amazon Transcribe
- Converts speech to text.
- Utilizes trained models for transcription subtleties.
- Supports 37 languages (as of publishing).
Transcription Complexity:
- Complex mapping from spoken word to text.
- Examples: Currency and date formats.
Transcription APIs:
- StartTranscriptionJob: Batch processing with media file input.
- GetTranscriptionJob: Retrieves results and status of transcription jobs.
- StartStreamTranscription: Real-time transcription for streaming media.
Improving Transcription Accuracy**:
- Custom Vocabulary: Enhance transcription of unique terms (e.g., brand names, technical terms).
- Custom-Language Model: Train with domain-specific text for context recognition.
Vocabulary Management:
- Vocabulary Filter: Masks or filters specific words (e.g., offensive language).
- Content Redaction: Automatic masking/removal of sensitive information (e.g., credit card numbers).
Additional Features:
- Language Identification: Detects language of the media file.
- Speaker Diarization: Identifies different speakers in a media file.
Specialized Use Cases:
- Medical and call-center analytics.
- Detailed information in AWS Documentation.
Practical Application:
- Suitable for diverse scenarios, from everyday conversations to technical and domain-specific dialogues.
Amazon Translate
- Service for text translation between different languages.
- Supports 75 languages (as of course creation).
APIs for Translation:
- TranslateText: Real-time text translation.
- StartTextTranslationJob: Asynchronous batch translation.
- DescribeTextTranslationJob: Checks status and results of batch translation jobs.
Customization Options:
- Formality setting: Adjusts translation for formal or informal language (supported in French, German, Hindi, Italian, Japanese, Spanish).
- Profanity replacement: Uses grawlix strings for profane words/phrases.
- Custom terminology: Control translations of specific terms, unidirectional or multidirectional.
- Non-translated text tags: Use tags like
<span>
or<p>
withtranslate=no
property to exclude text from translation. - Parallel data: Customize translation output for specific domains (e.g., life sciences, law, finance).
Practical Use:
- Simple translations with additional control over translation quality, style, and domain-specific language.
Amazon Lex
- Conversational interface supporting voice and text-chat.
- Utilizes Alexa’s conversational engine.
- Fully managed service with automatic scaling, backup, and versioning.
Conversation Example: Booking a car.
- Key elements: Destination city and departure date.
Bot Configuration in Amazon Lex:
- Intents: Tasks to complete via conversation (e.g., BookCar).
- Utterances: Phrases triggering an intent.
- Slots: Parameters required to fulfill an intent.
- Prompts: Questions eliciting slot responses.
Intent Configuration:
- PromptSpecification: Message confirming intent completion.
- DeclinationResponse: Used if user refuses promptSpecification.
- DialogCodeHook: AWS Lambda function for additional conversational logic.
- FulfillmentCodeHook: Passes completed intent to a Lambda function for processing or local device actions.
Slot Types:
- Defined questions for user response.
- EnumerationValues for training or validation.
- Two modes: Expand values (training data) or restrict to set values (validation).
Built-in Intents and Slot Types:
- Common intents (e.g., HelpIntent, CancelIntent) and slot types (e.g., dates, numbers) pre-configured.
Integration with External Systems:
- Customization of user interactions.
- Integration with databases or external systems via Lambda functions.
- DialogCodeHook: Initialization and validation.
- FulfillmentCodeHook: Post-intent processing and direction.
Lambda Function Integration:
- Receives context (bot, intent, slots) for processing.
- Directs Lex for next steps using dialog actions (e.g., Close, ConfirmIntent, Delegate, ElicitIntent, ElicitSlot).
Amazon SageMaker
- Designed for preparing, building, training, and deploying machine learning models.
- Offers fine control over model creation and deployment.
Jupyter Notebooks in SageMaker:
- Interactive environment for code, documentation, and visualizations.
- Supports data preparation, visualization, training, and deploying models.
- Can clone and run notebooks from GitHub.
Training a Model in SageMaker:
- Variety of algorithms available: SageMaker-provided, custom, or from AWS Marketplace.
- Custom algorithms via Docker containers.
- Selection of training instance types (standard, compute-optimized, GPU-accelerated).
- Hyperparameter tuning for algorithm optimization.
- Input data channels from S3 or EFS, format depends on the algorithm.
AWS Marketplace:
- Buy or sell trained models and algorithms.
- Marketplace offers pre-trained models or algorithms for training.
Hosting Models for Inferences:
- Real-Time Inference: API endpoint for real-time model access.
- Serverless Inference: Cost-effective for infrequent or unpredictable traffic.
- Asynchronous Endpoint: Queues requests for processing with large payloads/long processing times.
- Batch Transformation: Process data in batches without a long-lived endpoint.
Practical Applications:
- Real-time applications like fraud detection.
- Batch processing for non-real-time requirements.
Further Learning in SageMaker:
- Advanced machine learning capabilities for deeper exploration.