Categories: All - optimization - recommendation - adaptation - attention

by Eric Nic, 20 days ago

Product AI Service

Techniques for encoding text, including various positional and token embeddings, are crucial for effective natural language processing. Distributed training of large language models (

Product AI Service

QA Types

Special Types of QA

QA Types Based on Interaction

Multiple-Choice QA
Yes/No QA
Contextual QA (Clarification-Based)
Conversational QA

Methodological Distinctions (Based on Answer Generation)

Knowledge Based QA
Rule Based QA
Retrieval-Augmented QA
Extractive QA
Generative QA
Abstractive QA

UC - Data

Personalized Rec

Semi-Personalized Rec

Category Recommendation

Popularity Based Recommendation

Reviews Sentiment Labeling

Reviews Summarization

Tag Based Search

Tag Analysis and/or Generation

Similarity Search

Business Sectors

Manufacturing

Knowledge Management

Supply Chain Management

Human Resources and Talent Management

Research and Development

Healthcare

Education and Training

Regulatory Compliance

Customer Relationship Management (CRM)

Sales and Marketing

Finance and Banking

e-Business

Basic LLMs Tasks

Content Generation and Correction

Information Extraction

Text-to-Text Transformation

Semantic Search

Sentiment Analysis

Content Personalization

Ethical and Bias Evaluation

Paraphrasing

Language Translation

Text Summarization

Conversational AI

Question Answering

ML Scenarios & Tasks

Federated Learning

Meta-Learning (Learning to Learn)

Active Learning

Transfer Learning: Leverages knowledge from one task to improve learning in a related but different task. This is particularly useful when there is limited labeled data in the target domain.

Self-Supervised Learning: A form of unsupervised learning where the data itself provides the supervision.

Multi-Task Learning: Trains a model on multiple related tasks simultaneously, sharing representations between tasks to improve generalization.

Reinforcement Learning: Trains an agent to make a sequence of decisions by learning from interactions with an environment. The agent receives rewards or penalties and aims to maximize cumulative reward. Typical applications: game playing, robotics, and autonomous vehicles.

Semi-Supervised Learning: Combines labeled and unlabeled data to improve learning accuracy. It is often used when obtaining a large amount of labeled data is expensive or time-consuming.

Unsupervised Learning: The model is given raw, unlabeled data and has to infer its own rules and structure from the information.

Dimensionality Reduction
PCA, t-SNE
Clustering

Supervised Learning: Uses labeled datasets to train algorithms to predict outcomes and recognize patterns.

Regression
Classification
Binary, Multi-Class, Multi-Label

AI/ML Projects Types

Domain-Specific

Innovation and R&D Projects
Technology and Software Development
Entertainment and Media
Public Safety
Agriculture
Education
Energy and Utilities
Transportation and Logistics
Insurance
E-commerce
Manufacturing and Logistics
Finance and Banking
Healthcare and Medicine

Technical Categorization

Speech Recognition and Audio Analysis
Generative AI (LLMs)
Text Mining and Natural Language Processing (NLP)
Signal
Supervised/Unsupervised
Time-Series Forecasting

Strategic Categorization: Organizational goals, market positioning, and industry-specific needs

Training and Development
Social Impact and Sustainability
Data-Driven Decision Support
Product and Service Innovation
Risk Management and Compliance
Customer Experience Enhancement
Optimization and Efficiency Projects

AI Strategy

Traditional Models (CPU), Local (Less Data, Predictive Modeling)

Pretrained Models (CPU), Local (Moderate Data, Predictive Modeling)

Some Use Cases
MLOps / CI-CD
Cost

Applying LLMs (Requires Large Data, Mostly QA)

LLMs on Cloud, All UCs
All Use Cases
MLOps / CI-CD (Level 2)
API (e.g. OpenAI), Some UCs
Some Use Cases
AI Wow
Output Quality
Cost (Long Term)
Cost (Short Term)
Run-Time

Transformer

PaLM Family

U-PaLM

PaLM-E

PaLM2

PaLM

Flan-PaLM

Med-PaLM M

Med-PaLM2

Med-PaLM

Distributed LLM Training

Optimizer Parallelism: Focuses on partitioning optimizer state and gradients to reduce memory consumption on individual devices.

Model Parallelism: Splits the model itself across devices, either within layers (tensor parallelism), across layers (pipeline parallelism), or both; highly scalable but complex to implement.

Hybrid Parallelism: Combines pipeline and tensor parallelism, tuned to the model architecture and the available resources.
Tensor Parallelism: Shards a single tensor within a layer across devices; efficient for computation but requires careful communication management.
Pipeline Parallelism: Divides the model into stages (groups of layers) and assigns each stage to a different device; reduces per-device memory usage but introduces latency between stages.

Data Parallelism: Replicates the entire model across devices and splits each batch among the replicas; easy to implement, but each device must hold the full model in memory.
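
To make the tensor parallelism entry above more concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the "devices" are simulated with ordinary lists, the helper names (`shard_columns`, `tensor_parallel_matmul`) are hypothetical, and the number of columns is assumed to divide evenly by the number of devices. A real system would use a distributed framework and collective communication instead of list concatenation.

```python
def matmul(x, w):
    """Multiply x (n x k) by w (k x m), both given as lists of rows."""
    return [
        [sum(row[t] * w[t][j] for t in range(len(w))) for j in range(len(w[0]))]
        for row in x
    ]


def shard_columns(w, num_devices):
    """Split the columns of w into contiguous shards, one per simulated device.

    Assumption: the column count is divisible by num_devices.
    """
    step = len(w[0]) // num_devices
    return [[row[d * step:(d + 1) * step] for row in w] for d in range(num_devices)]


def tensor_parallel_matmul(x, w, num_devices):
    """Compute x @ w with the weight matrix sharded column-wise across devices."""
    shards = shard_columns(w, num_devices)
    # Each simulated device multiplies the full input by its own weight shard.
    partials = [matmul(x, shard) for shard in shards]
    # Stand-in for an all-gather: concatenate the column slices row by row.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]


if __name__ == "__main__":
    x = [[1, 2], [3, 4]]
    w = [[1, 0, 2, 1], [0, 1, 1, 3]]
    print(tensor_parallel_matmul(x, w, num_devices=2) == matmul(x, w))
```

The sharded result matches the unsharded product because each output column depends on only one column of the weight matrix, which is what makes column-wise sharding possible; the concatenation step is where real implementations pay the communication cost.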

PEFT

Limited Computational Resource

Fine-Tuning II

Our Dataset is Different from the Pre-Trained Data

Fine-Tuning I

Large Labeled Dataset is Available

Background

LLMs Adaptation Stages

Multi-Turn Instructions
Single-Turn Instructions
Reasoning in LLMs
In-context
Zero-Shot
Fine-Tuning
Instruction-tuning
Transfer Learning
Alignment-tuning

RLHF

Pre-Training

Language Modeling

Architecture

Attention in LLMs

LLM Essentials

Prompting

- Zero-Shot Prompting - In-context Learning - Single- and Multi-Turn Instructions

Language Modeling

- Full Language Modeling - Prefix Language Modeling - Masked Language Modeling - Unified Language Modeling

Transformers Architectures

- Encoder-Decoder: Processes the input through the encoder and passes the intermediate representation to the decoder, which generates the output.
- Causal Decoder: Has no encoder; a decoder alone processes the input and generates the output, and each predicted token depends only on previous time steps.
- Prefix Decoder: A decoder in which attention is not strictly dependent on past information; attention over the prefix is bidirectional.
- Mixture-of-Experts: A variant of the transformer architecture with parallel independent experts and a router that routes tokens to experts.
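
The difference between the causal decoder and the prefix decoder comes down to which positions each token may attend to. The sketch below builds the two attention masks as boolean matrices. The function names and the `prefix_len` parameter are hypothetical, chosen for illustration only.

```python
def causal_mask(seq_len):
    """mask[i][j] is True when position i may attend to position j.

    Causal decoder: each position sees itself and earlier positions only.
    """
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def prefix_mask(seq_len, prefix_len):
    """Prefix decoder: bidirectional inside the prefix, causal after it.

    Assumption: the first prefix_len positions form the prefix (the input).
    """
    return [
        [j <= i or j < prefix_len for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    for name, mask in [("causal", causal_mask(4)), ("prefix", prefix_mask(4, 2))]:
        print(name)
        for row in mask:
            print(" ".join("x" if allowed else "." for allowed in row))
```

In the prefix mask, the first `prefix_len` rows can see the whole prefix, including later prefix tokens, while every later row behaves exactly like a row of the causal mask.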

Fine Tuning

- Instruction-tuning - Alignment-tuning - Transfer Learning

NLP Fundamentals

Encoding Positions
- ALiBi - RoPE
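
As a rough illustration of the two position-encoding schemes named above, the following sketch implements a rotary embedding (RoPE) on a single vector and an ALiBi-style bias matrix. It rests on several assumptions: the vector length is even, the `base` of 10000 is the commonly used default, and `alibi_bias` takes a single hypothetical `slope` rather than the per-head slopes a real model would use.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of vec by angles that grow with position.

    Assumption: len(vec) is even.
    """
    d = len(vec)
    out = []
    for k in range(0, d, 2):
        theta = position * base ** (-k / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[k], vec[k + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def alibi_bias(seq_len, slope):
    """Linear distance penalty added to attention scores (causal positions only)."""
    return [
        [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    q, k = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]
    # Same relative offset (2) at different absolute positions.
    print(dot(rope(q, 3), rope(k, 1)), dot(rope(q, 12), rope(k, 10)))
    print(alibi_bias(3, 0.5))
```

The two printed dot products agree up to floating-point error, which is the property RoPE is designed for: the query-key score depends on the relative offset between positions, not on their absolute values.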
Tokenization
- WordPiece - Byte pair encoding (BPE) - UnigramLM
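
For byte pair encoding, the core idea is to repeatedly merge the most frequent adjacent pair of symbols. The sketch below learns merges from a whitespace-split toy corpus. It is a simplified illustration, not a production tokenizer: the function names are hypothetical, it works on characters rather than bytes, and ties are broken by an arbitrary but deterministic rule.

```python
from collections import Counter


def count_pairs(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(words, pair):
    """Replace every occurrence of pair with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Return the learned merges and the final segmentation of each word."""
    words = {tuple(w): f for w, f in Counter(corpus.split()).items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(words)
        if not pairs:
            break
        # Most frequent pair; ties broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        words = merge_pair(words, best)
        merges.append(best)
    return merges, words


if __name__ == "__main__":
    merges, words = learn_bpe("low low low lower", num_merges=2)
    print(merges)
    print(words)
```

Each learned merge becomes a vocabulary entry; at tokenization time the same merges would be replayed in order on new text.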

Attention In LLMs

- Self-Attention: Calculates attention using queries, keys, and values from the same block (encoder or decoder).
- Cross-Attention: Used in encoder-decoder architectures, where the queries come from the decoder and the key-value pairs come from the encoder outputs.
- Sparse Attention: To speed up the computation of self-attention, sparse attention calculates attention within sliding windows rather than over the full sequence.
- Flash Attention: To speed up attention on GPUs, flash attention uses input tiling to minimize memory reads and writes between the GPU high-bandwidth memory (HBM) and the on-chip SRAM.
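
The self-attention and sliding-window (sparse) entries above can be sketched with one small function. This is a plain-Python illustration under stated assumptions: vectors are lists of floats, there are no learned projection matrices, and the `window` parameter is a hypothetical name for the sliding-window radius. It does not reproduce the memory tiling that flash attention relies on.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, window=None):
    """Scaled dot-product attention over lists of vectors.

    window=None gives full attention; window=w lets query i see only keys j
    with |i - j| <= w, a simple sliding-window form of sparse attention.
    """
    d = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        visible = [j for j in range(len(keys)) if window is None or abs(i - j) <= window]
        scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d) for j in visible]
        weights = softmax(scores)
        outputs.append([
            sum(w * values[j][c] for w, j in zip(weights, visible))
            for c in range(len(values[0]))
        ])
    return outputs


if __name__ == "__main__":
    x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(attention(x, x, x))            # self-attention: q, k, v from the same block
    print(attention(x, x, x, window=1))  # each token sees only its neighbours
```

Passing the same sequence as queries, keys, and values gives self-attention; passing queries from one sequence and keys and values from another would give the cross-attention pattern.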

LLM Components

Adaptation

Decoding Strategies

Alignment

Fine-tuning and Instruction Tuning

Model Pre-training

LLM Architectures

Positional Encoding

Tokenizations

LLM Capabilities

Augmented

Interacting with users
Virtual acting
Assignment planning
Tool utilization
Task decomposition
Knowledge base utilization
Tool planning
Self-improvement
Self-refinement
Self-criticism

Emerging

Reasoning
Arithmetic
Symbolic
Common Sense
Logical
Instruction following
Few-shot
Turn based
Task definition
In-context learning
Pos/Neg example
Symbolic reference
Step by step solving
Comprehension
Reading Comprehension
Simplification
Summarization
Multilingual
Crosslingual QA
Crosslingual Tasks
Translation
World Knowledge
Understanding of global issues and challenges
Awareness of global economic and political systems
Knowledge of international law and policies
Familiarity with different cultures and societies
Understanding global events and trends
Coding
Continuous learning and updating coding skills
Good understanding of algorithms and data structures for effective coding
Coding knowledge enables LLMs to automate tasks and improve efficiency
Proficient in coding to enhance their capabilities
Programming languages are a key skill for LLMs to develop

- Masked Language Modeling - Causal Language Modeling - Next Sentence Prediction - Mixture of Experts

Text Embedding

Basic

- Supervised Fine-tuning - General Fine-tuning - Multi-turn Instructions - Instruction Following

- Decoder-Only - Encoder-Decoder - Hybrid

- Absolute Positional Embeddings - Relative Positional Embeddings - Rotary Position Embeddings - Relative Positional Bias

- BytePairEncoding - WordPieceEncoding - SentencePieceEncoding

Sentiment, 220000 Reviews, seconds ~ hours

Product AI Service

Main topic

Time-Series Clustering

Time-Series Anomaly Detection

Natural Language Processing (NLP)
Computer Vision

Data Layer

Gathering, Cleaning, Preparation
Set up ETL Layer

Product Similarity

Project
Signal Processing
Anomaly Detection
Recommendation Systems
Predictive Modeling
Machine Learning Consulting and Optimization

Product Recommendation

Build
Personalized (User Interactions Available)
Semi-Personalized (Rich User Profile)
Popular Recommendation (User Login)

Product

User