Categories: All - search - interaction - recommendation

by Eric Nic, 11 days ago

LLMs Learning Path

This learning path collects techniques and approaches used in artificial intelligence to optimize model performance and user interaction. Positional embeddings, including absolute, relative, and rotary types, let a model represent the order of tokens in a sequence.

Prompt Engineering

Optimization and Efficiency

Optimization by Prompting [Yang et al., 2023]

Code Generation and Execution

Chain of Code (CoC) Prompting [Li et al., 2023b]
Structured Chain-of-Thought (SCoT) Prompting [Li et al., 2023c]
Program of Thoughts (PoT) Prompting [Chen et al., 2022]

Fine-Tuning and Optimization

Automatic Prompt Engineer (APE) [Zhou et al., 2022]

User Interaction

Active-Prompt [Diao et al., 2023]

Reduce Hallucination

ReAct Prompting [Yao et al., 2022]
Retrieval Augmented Generation (RAG) [Lewis et al., 2020]
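
As a rough illustration of the retrieval-augmented idea listed above, the sketch below picks the passage most similar to a question and prepends it to the prompt. The toy corpus, the bag-of-words cosine score, and the names `retrieve` and `build_rag_prompt` are illustrative assumptions; a real system would typically use a learned dense retriever and pass the prompt to an actual LLM.

```python
import math
from collections import Counter

def bow(text):
    """Lowercased bag-of-words vector as a Counter."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    q = bow(query)
    return sorted(passages, key=lambda p: cosine(q, bow(p)), reverse=True)[:k]

def build_rag_prompt(query, passages, k=1):
    """Prepend retrieved context so the model can ground its answer."""
    context = "\n".join(retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical toy corpus, for illustration only.
corpus = [
    "RoPE encodes token positions by rotating query and key vectors.",
    "Byte pair encoding merges the most frequent adjacent symbol pair.",
    "Pipeline parallelism assigns groups of layers to different devices.",
]
prompt = build_rag_prompt("How does byte pair encoding work?", corpus)
```

Grounding the answer in retrieved text is what lets this family of methods reduce hallucination: the model is asked to answer from the supplied context rather than from memory alone.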

Reasoning and Logic

Graph-of-Thought (GoT) Prompting [Yao et al., 2023b]
Least-to-Most Prompting [Zhou et al., 2023]
Tree-of-Thoughts (ToT) Prompting [Yao et al., 2023a]
Self-Consistency [Wang et al., 2022]
Automatic Chain-of-Thought (Auto-CoT) [Zhang et al., 2022]
Chain-of-Thought (CoT) Prompting [Wei et al., 2022]
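
Among the reasoning methods listed above, Self-Consistency is simple enough to sketch: sample several chain-of-thought completions, extract each final answer, and keep the most frequent one. The sampled answers below are hypothetical stand-ins for model outputs, and `self_consistent_answer` is an illustrative name, not an API from the cited paper.

```python
from collections import Counter

def self_consistent_answer(sampled_answers):
    """Majority vote over final answers from independently sampled reasoning paths."""
    votes = Counter(a.strip().lower() for a in sampled_answers)
    answer, count = votes.most_common(1)[0]
    # The second value is the fraction of samples that agree with the winner.
    return answer, count / len(sampled_answers)

# Hypothetical final answers extracted from five sampled chains of thought.
samples = ["18", "18", "26", "18 ", "17"]
answer, agreement = self_consistent_answer(samples)
```

The agreement ratio can serve as a crude confidence signal: low agreement suggests the question may need more samples or a different prompt.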

New Tasks Without Extensive Training

Few-shot Prompting [Brown et al., 2020]
Zero-shot Prompting [Radford et al., 2019]
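
The difference between the two styles above is only in what the prompt contains: zero-shot gives an instruction alone, while few-shot adds worked demonstrations before the query. A minimal sketch follows; the `Input:`/`Output:` template and both function names are assumptions chosen for illustration.

```python
def zero_shot_prompt(instruction, query):
    """Instruction plus the query, with no demonstrations."""
    return f"{instruction}\n\nInput: {query}\nOutput:"

def few_shot_prompt(instruction, examples, query):
    """Instruction, then (input, output) demonstrations, then the query."""
    demos = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n\n{demos}\n\nInput: {query}\nOutput:"

instruction = "Classify the sentiment of the review as positive or negative."
examples = [("Great battery life.", "positive"), ("Broke after a week.", "negative")]
zs = zero_shot_prompt(instruction, "Fast delivery, works well.")
fs = few_shot_prompt(instruction, examples, "Fast delivery, works well.")
```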

Plan

Modeling Feat Set 3

Deploy Feat Set 3

Modeling Feat Set 2

Deploy Feat Set 2

Modeling Feat Set 1

Deploy Feat Set 1

Planning / Estimation AI Agent + Knowledge Extraction Maturity Level #1

Inference / Deployment Phase

Edge

Modeling / Prototyping Phase

Open LLMs
Edge Decentralized
OpenAI Services
Edge

AI Agent System Abilities

Personal and Collaborative

Execution and Interaction

Planning and Decision Making

Perceiving and Predictive Modeling

Self-learning and Continuous Improvement

LLM-Based Agent

Action

Feedback Loop
Environment Interaction
Response Generation

Brain

Tool Interface
Knowledge Integration
Transferability & Generalization
Reasoning & Planning Layer
Memory Capabilities and Retrieval
Core LLM Capabilities

Perception

Preprocessing
Input Modalities
Context Integration

LLM-Based Agent Brain: LLM as the Main Part

QA Types

Special Types of QA

QA Types Based on Interaction

Multiple-Choice QA
Yes/No QA
Contextual QA (Clarification-Based)
Conversational QA

Methodological Distinctions (Based on Answer Generation)

Knowledge Based QA
Rule Based QA
Retrieval-Augmented QA
Extractive QA
Generative QA
Abstractive QA

UC - Data

Personalized Rec

Semi-Personalized Rec

Category Recommendation

Popularity Based Recommendation

Reviews Sentiment Labeling

Reviews Summarization

Tag Based Search

Tag Analysis and/or Generation

Similarity Search

Business Sectors

Manufacturing

Knowledge Management

Supply Chain Management

Human Resources and Talent Management

Research and Development

Healthcare

Education and Training

Regulatory Compliance

Customer Relationship Management (CRM)

Sales and Marketing

Finance and Banking

e-Business

Basic LLMs Tasks

Content Generation and Correction

Information Extraction

Text-to-Text Transformation

Semantic Search

Sentiment Analysis

Content Personalization

Ethical and Bias Evaluation

Paraphrasing

Language Translation

Text Summarization

Conversational AI

Question Answering

ML Scenarios & Tasks

Federated Learning

Meta-Learning (Learning to Learn)

Active Learning

Transfer Learning: Involves leveraging knowledge from one task to improve learning in a related but different task. This is particularly useful when there is limited labeled data in the target domain.

Self-Supervised Learning: A form of unsupervised learning where the data itself provides the supervision.

Multi-Task Learning: Involves training a model on multiple related tasks simultaneously, sharing representations between tasks to improve generalization.

Reinforcement Learning: Involves training an agent to make a sequence of decisions by learning from interactions with an environment. The agent receives rewards or penalties and aims to maximize cumulative rewards. Typical applications include game playing, robotics, and autonomous vehicles.
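
The interaction loop described above can be sketched with tabular Q-learning on a toy environment. Everything here is an assumption made for illustration: a four-state corridor where the agent starts at state 0 and earns a reward of 1 on reaching state 3, an epsilon-greedy policy, and arbitrary hyperparameters.

```python
import random

N_STATES, GOAL = 4, 3   # states 0..3 on a line; reaching state 3 ends the episode
ACTIONS = (-1, +1)      # action 0 moves left, action 1 moves right

def step(state, action):
    """Toy environment: returns (next_state, reward)."""
    nxt = min(max(state + ACTIONS[action], 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)

def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Epsilon-greedy: explore occasionally, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: q[s][x])
            nxt, r = step(s, a)
            # Move the estimate toward reward plus discounted best future value.
            q[s][a] += alpha * (r + gamma * max(q[nxt]) - q[s][a])
            s = nxt
    return q

q = q_learning()
# Greedy policy per non-terminal state after training.
policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(GOAL)]
```

The discount factor `gamma` is what makes the agent value cumulative rather than immediate reward: states farther from the goal end up with smaller, but still positive, values for moving right.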

Semi-Supervised Learning: Combines labeled and unlabeled data to improve learning accuracy. It is often used when obtaining a large amount of labeled data is expensive or time-consuming.

Unsupervised Learning: The model is given raw, unlabeled data and has to infer the underlying rules and structure on its own.

Dimensionality Reduction
PCA, t-SNE
Clustering
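
To make the clustering task above concrete, here is a minimal k-means sketch on 2-D points. The data and the initial centroids are made up for illustration, and initial centroids are passed in explicitly so the result is deterministic; library implementations usually choose them with a randomized scheme.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (2, 1)])
```

No labels are used anywhere: the grouping emerges only from distances between the points.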

Supervised Learning: Uses labeled datasets to train algorithms to predict outcomes and recognize patterns.

Regression
Classification
Binary, Multi-Class, Multi-Label

AI/ML Projects Types

Domain-Specific

Innovation and R&D Projects
Technology and Software Development
Entertainment and Media
Public Safety
Agriculture
Education
Energy and Utilities
Transportation and Logistics
Insurance
E-commerce
Manufacturing and Logistics
Finance and Banking
Healthcare and Medicine

Technical Categorization

Speech Recognition and Audio Analysis
Computer Vision
Generative AI (LLMs)
Recommendation Systems
Text Mining and Natural Language Processing (NLP)
Predictive Modeling
Signal
Supervised/Unsupervised
Time-Series Forecasting

Strategic Categorization: Organizational goals, market positioning, and industry-specific needs

Training and Development
Social Impact and Sustainability
Data-Driven Decision Support
Product and Service Innovation
Risk Management and Compliance
Customer Experience Enhancement
Optimization and Efficiency Projects

AI Strategy

Traditional Models (CPU): Local (With Less Data, Predictive Modeling)

Pretrained Models (CPU): Local (Moderate Data, Predictive Modeling)

Some Use Cases
MLOps / CI-CD
Cost

Applying LLMs (Requires Large Data, Mostly QA)

LLMs on Cloud, All UCs
All Use Cases
MLOps / CI-CD (Level 2)
API (e.g. OpenAI), Some UCs
Some Use Cases
AI Wow
Output Quality
Cost (Long Term)
Cost (Short Term)
Run-Time

Transformer

PaLM Family

U-PaLM

PaLM-E

PaLM2

PaLM

Flan-PaLM

Med-PaLM M

Med-PaLM2

Med-PaLM

Distributed LLM Training

Optimizer Parallelism: Focuses on partitioning optimizer state and gradients to reduce memory consumption on individual devices.

Model Parallelism: Splits the model itself across devices, using tensor parallelism, pipeline parallelism, or a hybrid of both. It scales to models too large for a single device but requires a complex implementation.

Hybrid Parallelism: Combines pipeline and tensor parallelism for optimal performance based on the model architecture and available resources.
Tensor Parallelism: Shards a single tensor within a layer across devices; efficient for computation but requires careful communication management.
Pipeline Parallelism: Divides the model itself into stages (groups of layers) and assigns each stage to a different device; reduces memory usage per device but introduces latency.

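Data Parallelism: Replicates the entire model across devices and gives each replica a different slice of the data; easy to implement, but every device must hold the full model, so it is limited by memory constraints.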
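
As one concrete case of the strategies above, tensor parallelism for a linear layer can be sketched by splitting the weight matrix by columns, letting each "device" compute its slice, and concatenating the results. The devices here are simulated with plain Python lists, and the function names are illustrative assumptions; real frameworks do the same thing with collective communication between GPUs.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def shard_columns(w, n_devices):
    """Split w into n_devices column blocks (assumes the column count divides evenly)."""
    cols = list(zip(*w))
    size = len(cols) // n_devices
    return [[list(r) for r in zip(*cols[i * size:(i + 1) * size])] for i in range(n_devices)]

def tensor_parallel_matmul(x, w, n_devices):
    # Each simulated device multiplies the full input by its own column shard.
    partials = [matmul(x, shard) for shard in shard_columns(w, n_devices)]
    # Stand-in for an all-gather: concatenate the column blocks row by row.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
y = tensor_parallel_matmul(x, w, n_devices=2)
```

Each device stores only its share of `w`, which is where the memory saving comes from; the concatenation step is the communication cost mentioned above.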

PEFT

Limited Computational Resource

Fine-Tuning II

Our Dataset is Different from the Pre-Trained Data

Fine-Tuning I

Large Labeled Dataset is Available

Background

LLMs Adaptation Stages

Multi-Turn Instructions
Single-Turn Instructions
Reasoning in LLMs
In-context
Zero-Shot
Fine-Tuning
Instruction-tuning
Transfer Learning
Alignment-tuning

RLHF

Pre-Training

Language Modeling

Architecture

Attention in LLMs

LLM Essentials

Prompting

- Zero-Shot Prompting - In-context Learning - Single and Multi-Turn Instructions

Language Modeling

- Full Language Modeling - Prefix Language Modeling - Masked Language Modeling - Unified Language Modeling

Transformers Architectures

- Encoder-Decoder: The encoder processes the input and passes its intermediate representation to the decoder, which generates the output.
- Causal Decoder: Has no encoder; a decoder alone processes the input and generates the output, and each predicted token depends only on previous time steps.
- Prefix Decoder: Attention is not strictly limited to past information; it is bidirectional over the input prefix and causal over the generated tokens.
- Mixture-of-Experts: A Transformer variant with parallel independent experts and a router that sends each token to selected experts.
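
The difference between a causal decoder and a prefix decoder comes down to the attention mask. The sketch below builds both as boolean matrices, where entry `[i][j]` says whether position `i` may attend to position `j`; the function names and the `prefix_len` parameter are illustrative assumptions.

```python
def causal_mask(n):
    """Each position attends only to itself and earlier positions."""
    return [[j <= i for j in range(n)] for i in range(n)]

def prefix_mask(n, prefix_len):
    """Bidirectional attention inside the prefix, causal attention afterwards."""
    return [[j <= i or j < prefix_len for j in range(n)] for i in range(n)]

causal = causal_mask(4)
prefix = prefix_mask(4, prefix_len=2)
```

With `prefix_len=2`, position 0 can see position 1 even though it comes later, while positions 2 and 3 still cannot see the future.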

Fine Tuning

- Instruction-tuning - Alignment-tuning - Transfer Learning

NLP Fundamentals

Encoding Positions
- ALiBi - RoPE
Tokenization
- WordPiece - Byte pair encoding (BPE) - UnigramLM
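
Of the tokenization schemes above, BPE has the simplest training loop: count adjacent symbol pairs, merge the most frequent pair into a new symbol, and repeat. The sketch below performs a single merge on a small made-up word-frequency table; function names are illustrative and ties are broken by first occurrence, which is an assumption rather than part of any particular library.

```python
from collections import Counter

def most_frequent_pair(words):
    """words maps a tuple of symbols to its corpus frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

# Hypothetical word counts, each word split into characters.
words = {
    ("l", "o", "w"): 5,
    ("l", "o", "w", "e", "r"): 2,
    ("n", "e", "w", "e", "s", "t"): 6,
    ("w", "i", "d", "e", "s", "t"): 3,
}
pair = most_frequent_pair(words)
words_after = merge_pair(words, pair)
```

Repeating these two steps a fixed number of times yields the merge table that defines the vocabulary.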

Attention In LLMs

- Self-Attention: Calculates attention using queries, keys, and values from the same block (encoder or decoder).
- Cross-Attention: Used in encoder-decoder architectures, where the decoder provides the queries and the key-value pairs come from the encoder outputs.
- Sparse Attention: To speed up self-attention, sparse attention restricts each token to a subset of positions, for example by computing attention in sliding windows.
- Flash Attention: To speed up attention on GPUs, flash attention tiles the inputs to minimize memory reads and writes between the GPU high-bandwidth memory (HBM) and the on-chip SRAM.
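
All of these variants build on scaled dot-product attention. A minimal sketch in plain Python follows; it omits the learned query, key, and value projections and multiple heads, and the toy vectors are assumptions for illustration. Self-attention passes the same sequence three times, while cross-attention, as in the standard Transformer, takes queries from the decoder and keys and values from the encoder outputs.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy encoder outputs
decoder_states = [[0.5, 0.5], [1.0, 0.0]]              # toy decoder states

self_attn = attention(encoder_states, encoder_states, encoder_states)
cross_attn = attention(decoder_states, encoder_states, encoder_states)
```

The output always has one row per query, which is why cross-attention returns a result the length of the decoder sequence regardless of the encoder length.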

LLM Components

Adaptation

Decoding Strategies

Alignment

Fine-tuning and Instruction Tuning

Model Pre-training

LLM Architectures

Positional Encoding

Tokenizations

LLMs Capabilities

Augmented

Interacting with users
Virtual acting
Assignment planning
Tool utilization
Task decomposition
Knowledge base utilization
Tool planning
Self-improvement
Self-refinement
Self-criticism

Emerging

Reasoning
Arithmetic
Symbolic
Common Sense
Logical
Instruction following
Few-shot
Turn based
Task definition
In-context learning
Pos/Neg example
Symbolic reference
Step by step solving

Basic

Comprehension
Reading Comprehension
Simplification
Summarization
Multilingual
Crosslingual QA
Crosslingual Tasks
Translation
World Knowledge
Understanding of global issues and challenges
Awareness of global economic and political systems
Knowledge of international law and policies
Familiarity with different cultures and societies
Understanding global events and trends
Coding
Continuous learning and updating coding skills
Good understanding of algorithms and data structures for effective coding
Coding knowledge enables LLMs to automate tasks and improve efficiency
Proficient in coding to enhance their capabilities
Programming languages are a key skill for LLMs to develop

- Masked Language Modeling - Causal Language Modeling - Next Sentence Prediction - Mixture of Experts

Text Embedding

- Supervised Fine-tuning - General Fine-tuning - Multi-turn Instructions - Instruction Following

- Decoder-Only - Encoder-Decoder - Hybrid

- Absolute Positional Embeddings - Relative Positional Embeddings - Rotary Position Embeddings - Relative Positional Bias
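
Rotary position embeddings can be sketched in a few lines: each consecutive pair of vector components is rotated by an angle proportional to the token position. The sketch assumes an even vector length and the commonly used base of 10000; the function names are illustrative.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by position * base**(-i/d), for even i."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]
# Same offset of 2 between query and key positions, at two different absolute positions.
near = dot(rope(q, 5), rope(k, 3))
far = dot(rope(q, 12), rope(k, 10))
```

The two scores agree up to floating-point error, which shows the property that makes this a relative scheme: the query-key dot product depends only on the distance between positions, not on where they sit in the sequence.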

- BytePairEncoding - WordPieceEncoding - SentencePieceEncoding

Sentiment, 220,000 Reviews: seconds ~ hours

LLMs Learning Path

Agentic CoQA

https://github.com/anair123/Llama2-Powered-QA-Chatbot-For-Research-Papers

Langchain

https://huggingface.co/learn/cookbook/en/advanced_rag
https://huggingface.co/learn/cookbook/en/rag_zephyr_langchain

Prompt Engineering

Fine-Tuning, PEFT, Quantization

Low Latency Deployment Chatbot

GPT4All
Ollama
HF
Jan

Fine-Tuning (QA Chatbot)

Zephyr
SmolLM2 https://github.com/huggingface/smollm/tree/main
Mistral 7B
Llama 3.2
Gemma-2

Fundamentals

From Seq-to-Seq and RNN to Attention and Transformers
Attention Variants Papers
Vasilev: Chapters 2, 3, 6, 7, 8