Saturday, April 25, 2026

3.1 Intro to Gen AI Models - Gen AI: Models & Architecture

3.2 Intro to Gen AI Models - Gen AI: Models & Architecture

Course Overview & Objectives

Unit 3.3: Generative AI Models & Architecture
Key learning goals:

Define gen AI principles and significance
Differentiate from traditional AI (CNN, RNN, LSTM)
Understand functioning of GANs, variational autoencoders, transformers

Gen AI Applications & Industry Impact

Current widespread usage across industries
Creative applications:

Marketing teams generating graphics/social media posts
DALL-E and multimodal models (e.g., Google Gemini Nano)
Text-to-image/video generation

Automation & efficiency:

Customer support chatbots built on retrieval-augmented generation (RAG)
Vector databases store company manuals/documentation
Responses are grounded in those documents rather than in general internet searches
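
The retrieval step behind a RAG chatbot can be sketched roughly as follows. This is a toy illustration, not a production design: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `index` list stands in for a vector database, and the document snippets and helper names (`retrieve`, `build_prompt`) are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': word counts (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical snippets standing in for company manuals.
documents = [
    "To reset your password open settings and choose reset password",
    "Refunds are issued within five business days of approval",
    "The warranty covers hardware defects for two years",
]
index = [(doc, embed(doc)) for doc in documents]  # the "vector database"

def retrieve(question, k=1):
    """Return the k stored documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question):
    """Ground the model by placing the retrieved text in the prompt."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Calling `build_prompt("how do I reset my password")` produces a prompt that contains the password manual snippet, which is what would then be sent to the language model.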

Personalization:

Online retailer recommendation systems
Customer portfolio analysis for engagement

Industrial design:

Automotive component prototyping
Real estate/interior design with cost estimation
Material sourcing and supplier price comparison

Core Gen AI Definition & Mechanics

Generative AI uses neural networks to create new output from training data plus instructions (prompts)
Probabilistic generation:

Predicts a probability distribution over the next word/token
Samples from that distribution, typically restricted by top-k or top-p (nucleus) filtering
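
As a minimal sketch of how top-k and top-p filtering narrow the next-token distribution before sampling: the token scores below are made up for illustration, and the function names are hypothetical rather than taken from any particular library.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    kept, cumulative = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}

def sample(probs, rng=random):
    """Draw one token according to its probability."""
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

# Made-up scores for the token following "The cat sat on the"
logits = {"mat": 3.0, "sofa": 2.0, "roof": 1.0, "moon": -1.0}
probs = softmax(logits)
next_token = sample(top_p_filter(probs, 0.95))
```

Top-k always keeps a fixed number of candidates, while top-p keeps more or fewer depending on how confident the model is, which is why the two are often offered as separate settings.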

Architecture foundation:

Based on the transformer architecture (originally an encoder-decoder structure)
Decoder side handles generation; many current LLMs use only the decoder
Trained on massive publicly available datasets

Neural Network Fundamentals

Basic structure: input layer → hidden layers → output layer
Most of the computation occurs in the hidden layers (a feature of neural networks in general, not only LLMs)
Depth determined by number of hidden layers
Training process:

Forward pass: input flows through the layers to produce a prediction
Backpropagation: computes the gradient of the loss for each weight; gradient descent then updates the weights
Learning rate controls step size (mountain descent analogy)
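
The training loop can be sketched on the smallest possible case. As a simplifying assumption, the "network" here is a single neuron with one weight and one bias (no hidden layers), so the gradients can be written out by hand; the data and the `train` function are illustrative only.

```python
def train(xs, ys, learning_rate=0.05, epochs=500):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Forward pass: compute predictions from the current weights.
        preds = [w * x + b for x in xs]
        # Backward pass: gradient of the mean squared error for each parameter.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Gradient descent: step downhill, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Illustrative data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train(xs, ys)
```

With the default learning rate the parameters settle close to w = 2 and b = 1. With a much larger rate such as 0.5, each step overshoots the bottom of the valley and the values grow instead of converging, which is the failure the mountain descent analogy warns about.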

Traditional vs Generative AI Comparison

Traditional approaches:

Language translation: statistical machine translation
Art design: manual graphic design tools
Healthcare: limited data processing capacity

Generative advantages:

Parallel processing of whole sequences vs step-by-step sequential processing
Multimodal capabilities
Massive data absorption and retention
Augments rather than replaces human capabilities

Neural Network Types & Applications

Feed-forward: single-direction processing
Convolutional (CNN): image processing
Recurrent (RNN): sequential/temporal data

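LSTM: adds gated memory cells to address the vanishing-gradient problem that limits long-term memory in plain RNNs
GRU: simplified variant of LSTM that merges the forget and input gates into a single update gate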

Autoencoders: data compression to latent space

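Comparable to principal component extraction (a linear autoencoder learns the same subspace as PCA)
Variational autoencoders (VAEs) encode each input as a probability distribution over the latent space, so sampling produces new variations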

Transformers: parallel processing with self-attention mechanism

Model Architecture Deep Dive

Self-attention mechanism:

Key building block of transformers
Enables parallel processing vs sequential
Tokens pay attention to all other tokens simultaneously
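
A minimal sketch of scaled dot-product self-attention, under one loud simplification: queries, keys, and values are all set equal to the input vectors, whereas a real transformer derives them through learned projection matrices. The token vectors are made up for illustration.

```python
import math

def softmax(row):
    """Normalize a list of scores into weights that sum to 1."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned projections)."""
    d = len(x[0])
    # scores[i][j]: how strongly token i attends to token j
    scores = [
        [sum(qi * kj for qi, kj in zip(q, k)) / math.sqrt(d) for k in x]
        for q in x
    ]
    weights = [softmax(row) for row in scores]
    # Each output vector is a weighted average of all token vectors.
    out = [
        [sum(w * v[c] for w, v in zip(row, x)) for c in range(d)]
        for row in weights
    ]
    return weights, out

# Three made-up 2-dimensional token vectors
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, out = self_attention(tokens)
```

Every row of `weights` is computed independently of the others, so all tokens can be processed at once instead of one after another as in an RNN.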

Generative Adversarial Networks (GANs):

Generator creates samples from random noise
Discriminator tries to distinguish generated samples from real data
Adversarial training improves quality
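
The adversarial loop can be sketched in one dimension. Everything specific here is an assumption made for illustration: the "real" data is drawn from a normal distribution centered at 4, the generator is a linear function of the noise, and the discriminator is a single logistic unit. Real GANs use deep networks for both, and this toy version may oscillate rather than settle.

```python
import math
import random

def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def train_gan(steps=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: fake = a * z + b
    w, c = 0.1, 0.0   # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = rng.gauss(4.0, 0.5)   # sample of "real" data (assumed distribution)
        z = rng.gauss(0.0, 1.0)      # random noise fed to the generator
        fake = a * z + b
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        # Discriminator step: minimize -log D(real) - log(1 - D(fake)).
        grad_w = (d_real - 1.0) * real + d_fake * fake
        grad_c = (d_real - 1.0) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c
        # Generator step: minimize -log D(fake), i.e. try to fool the discriminator.
        d_fake = sigmoid(w * fake + c)
        grad_fake = (d_fake - 1.0) * w
        a -= lr * grad_fake * z
        b -= lr * grad_fake
    return (a, b), (w, c)

(gen_a, gen_b), (disc_w, disc_c) = train_gan(steps=200)
```

The two updates pull in opposite directions: the discriminator is rewarded for scoring real samples high and fake samples low, and the generator is rewarded when its samples score high.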

Autoencoder applications:

Denoising technology
Feature extraction
Image generation through latent space manipulation
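
To make the latent-space idea concrete, here is a small sketch of a linear autoencoder that compresses 2-D points into a single number and reconstructs them. The data (points on the line y = 2x), the starting weights, and the function names are assumptions chosen so the example stays short; image autoencoders use many layers and nonlinearities.

```python
def reconstruction_loss(data, enc, dec):
    """Mean squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        z = enc[0] * x[0] + enc[1] * x[1]
        total += sum((dec[i] * z - x[i]) ** 2 for i in range(2))
    return total / len(data)

def train_autoencoder(data, lr=0.1, epochs=200):
    enc = [0.5, 0.5]   # encoder weights: 2-D input -> 1-D latent code
    dec = [0.5, 0.5]   # decoder weights: 1-D latent code -> 2-D reconstruction
    n = len(data)
    for _ in range(epochs):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z = enc[0] * x[0] + enc[1] * x[1]             # encode to latent space
            err = [dec[i] * z - x[i] for i in range(2)]    # reconstruction error
            dz = sum(2 * err[i] * dec[i] for i in range(2))
            for i in range(2):
                g_dec[i] += 2 * err[i] * z / n
                g_enc[i] += dz * x[i] / n
        for i in range(2):
            enc[i] -= lr * g_enc[i]
            dec[i] -= lr * g_dec[i]
    return enc, dec

# Illustrative data: points on the line y = 2x
data = [(t, 2 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
enc, dec = train_autoencoder(data)

def decode(z):
    """Generate a point from a latent value, including values never seen in training."""
    return [dec[0] * z, dec[1] * z]
```

After training, moving `z` up or down and calling `decode(z)` slides along the line the data lies on, which is the one-dimensional analogue of generating new images by manipulating latent codes.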
