Saturday, May 16, 2026

4.1 Prompting

 

Course Introduction: Building LLM Applications

  • New 8-lesson unit focused on practical LLM application building

  • Today’s lesson: Advanced prompt engineering techniques (part 1)

  • Previous coverage: LangChain, Python/JavaScript compatibility, workflows

AI Tutor Case Study Framework

  • Scenario: Interactive AI tutor for high school students

  • Key challenge: Adapting to individual learning styles and preferences

  • Focus on personalized education for students struggling with traditional methods

  • Will revisit personalization concepts throughout unit

Prompt Engineering Fundamentals

  • Definition: Iterative design, testing, and evaluation of LLM instructions

  • Goal: Optimize output quality through the wording of the prompt itself, without retraining the model or supplying it with extra external knowledge

  • Leverages existing model capabilities through clear instructions

  • Process: Create → Test → Refine → Repeat until desired output achieved
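The Create → Test → Refine → Repeat loop above can be sketched in Python. This is a minimal sketch, not a prescribed workflow: `refine_prompt`, `fake_generate`, `fake_evaluate`, and `fake_revise` are hypothetical names, and the fake functions stand in for a real model call, a real quality check, and a human or automated rewrite step.

```python
from typing import Callable, List, Tuple


def refine_prompt(
    prompt: str,
    generate: Callable[[str], str],
    evaluate: Callable[[str], float],
    revise: Callable[[str, str], str],
    target_score: float = 0.9,
    max_rounds: int = 5,
) -> Tuple[str, List[float]]:
    """Run the prompt, score the output, and revise until the score is good enough."""
    scores: List[float] = []
    for _ in range(max_rounds):
        output = generate(prompt)        # Test: run the current prompt
        score = evaluate(output)         # Evaluate: score the output (0.0 to 1.0)
        scores.append(score)
        if score >= target_score:
            break                        # Desired output achieved, stop iterating
        prompt = revise(prompt, output)  # Refine: produce the next prompt version
    return prompt, scores


# Hypothetical stand-ins so the loop can run without a real model.
def fake_generate(prompt: str) -> str:
    return "focused answer" if "Complete the sentence" in prompt else "rambling answer"


def fake_evaluate(output: str) -> float:
    return 1.0 if output.startswith("focused") else 0.2


def fake_revise(prompt: str, output: str) -> str:
    return "Complete the sentence: " + prompt


final_prompt, history = refine_prompt(
    "The ocean is", fake_generate, fake_evaluate, fake_revise
)
```

The `max_rounds` cap is a design choice of this sketch: without it, a prompt that never reaches the target score would loop forever.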

Basic vs Advanced Prompting

  • Basic prompt example: “The ocean is”

    • Produces verbose, unfocused responses because the task is never stated

    • The model has to guess the intent, so it spends tokens on irrelevant text, which raises cost

  • Improved prompt: “Complete the sentence: The ocean is”

    • An explicit instruction tells the model what to do, so the response stays short and on target

  • Business impact: Unclear prompts = higher costs + poor user experience
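To make the cost point concrete, here is a small worked calculation. Everything numeric in it is assumed for illustration: the prices and the token counts are made-up values, not figures from the lesson or from any provider, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(
    prompt_tokens: int,
    completion_tokens: int,
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Cost of one request from token counts and per-1,000-token prices."""
    input_cost = (prompt_tokens / 1000) * input_price_per_1k
    output_cost = (completion_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Assumed prices per 1,000 tokens (illustrative only).
INPUT_PRICE = 0.50
OUTPUT_PRICE = 1.50

# Assumed token counts: the vague prompt is shorter but triggers a long reply,
# while the explicit prompt costs a few more input tokens and gets a short reply.
vague_cost = estimate_cost(4, 400, INPUT_PRICE, OUTPUT_PRICE)
specific_cost = estimate_cost(8, 20, INPUT_PRICE, OUTPUT_PRICE)

print(f"vague: {vague_cost:.3f}  specific: {specific_cost:.3f}")
```

Under these assumptions the longer, clearer prompt is the cheaper request, because output tokens dominate the bill.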

Prompt Elements Structure

  • Four key components for effective prompts:

    • Instruction: Clear directive/task for the model

    • Context: Additional grounding information

    • Input Data: Specific data points (dates, preferences, constraints)

    • Output Indicator: Format and structure requirements

  • Japan itinerary example progression:

    • Basic: “Create Japan itinerary for cherry blossom season”

    • Enhanced: Added solo traveler context, April 1-14 dates, cultural focus, flexible exploration periods

    • Result: More detailed, structured output with practical tips
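The four elements can be kept as separate fields and assembled into one prompt string. The sketch below assumes a hypothetical `PromptSpec` class and section labels. The context and input data come from the Japan example above, while the output format line is an assumption, since the notes do not record the exact wording used.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PromptSpec:
    """Holds the four prompt elements; only the instruction is required."""
    instruction: str
    context: str = ""
    input_data: Dict[str, str] = field(default_factory=dict)
    output_indicator: str = ""

    def render(self) -> str:
        """Join the non-empty elements into a single prompt string."""
        parts = [f"Instruction: {self.instruction}"]
        if self.context:
            parts.append(f"Context: {self.context}")
        if self.input_data:
            lines = "\n".join(f"- {key}: {value}" for key, value in self.input_data.items())
            parts.append("Input data:\n" + lines)
        if self.output_indicator:
            parts.append(f"Output format: {self.output_indicator}")
        return "\n\n".join(parts)


basic = PromptSpec(instruction="Create a Japan itinerary for cherry blossom season.")

enhanced = PromptSpec(
    instruction="Create a Japan itinerary for cherry blossom season.",
    context="Solo traveler with a focus on cultural experiences.",
    input_data={
        "Dates": "April 1-14",
        "Pace": "Leave flexible periods for unplanned exploration",
    },
    # Assumed wording for the output indicator (not given in the notes).
    output_indicator="Day-by-day plan with practical tips",
)

print(enhanced.render())
```

Keeping the elements separate makes it easy to vary one of them (for example, only the output format) while testing a prompt.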

LLM Parameter Settings

  • Temperature (commonly a 0-1 scale; some APIs accept values up to 2):

    • Lower (0.2): More deterministic, constrained responses

    • Higher (0.8): More creative, exploratory outputs

    • Common starting point: 0.7 (sweet spot between creativity and constraint; actual defaults vary by provider)

    • Medical/financial use cases: Lower temperatures for compliance

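  • Top P (nucleus sampling): Restricts sampling to the smallest set of tokens whose combined probability reaches P; lower values narrow the response, higher values allow more diversity

  • Max Length: Upper limit on generated tokens, used to control cost and verbosity

  • Stop Sequence: A string that ends generation as soon as the model emits it (prevents runaway output)

  • Frequency Penalty: Lowers the score of tokens in proportion to how often they have already appeared, which reduces repetition and saves tokens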
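How these settings interact is easier to see on a toy example. The sketch below is not any provider's implementation: it applies the usual textbook definitions (temperature divides the logits, top P keeps the most likely tokens up to a cumulative probability, the frequency penalty subtracts from the score of repeated tokens) to a made-up four-token vocabulary. `OCEAN_LOGITS`, `next_token_distribution`, and `generate` are hypothetical names, and the logits are fixed rather than recomputed from context as a real model would do.

```python
import math
import random
from collections import Counter
from typing import Dict, List, Optional, Sequence

# Made-up scores for the token that follows "The ocean is".
OCEAN_LOGITS: Dict[str, float] = {"blue": 2.0, "vast": 1.0, "deep": 0.5, ".": 0.0}


def next_token_distribution(
    logits: Dict[str, float],
    temperature: float = 0.7,
    top_p: float = 1.0,
    frequency_penalty: float = 0.0,
    generated: Sequence[str] = (),
) -> Dict[str, float]:
    """Turn raw scores into sampling probabilities under the given settings."""
    counts = Counter(generated)
    # Frequency penalty: lower a token's score once per earlier occurrence.
    adjusted = {t: s - frequency_penalty * counts[t] for t, s in logits.items()}
    # Temperature: small values sharpen the distribution, large values flatten it.
    temp = max(temperature, 1e-6)  # avoid division by zero at temperature 0
    scaled = {t: s / temp for t, s in adjusted.items()}
    peak = max(scaled.values())
    exps = {t: math.exp(s - peak) for t, s in scaled.items()}  # stable softmax
    total = sum(exps.values())
    probs = {t: e / total for t, e in exps.items()}
    # Top P: keep the most likely tokens until their probabilities sum to top_p.
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(kept.values())
    return {t: p / norm for t, p in kept.items()}


def generate(
    logits: Dict[str, float],
    max_tokens: int = 10,
    stop: Optional[str] = None,
    seed: int = 0,
    **sampling: float,
) -> List[str]:
    """Sample tokens until max_tokens is reached or the stop token is drawn."""
    rng = random.Random(seed)
    output: List[str] = []
    for _ in range(max_tokens):  # Max Length: hard cap on generated tokens
        dist = next_token_distribution(logits, generated=output, **sampling)
        token = rng.choices(list(dist), weights=list(dist.values()))[0]
        if stop is not None and token == stop:
            break  # Stop Sequence: end generation here
        output.append(token)
    return output


cold = next_token_distribution(OCEAN_LOGITS, temperature=0.2)
hot = next_token_distribution(OCEAN_LOGITS, temperature=0.8)
sample = generate(OCEAN_LOGITS, max_tokens=5, stop=".", temperature=0.8, frequency_penalty=0.5)
```

Comparing `cold` and `hot` shows the probability of the top token dropping as temperature rises, which is the deterministic versus exploratory trade-off described above.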

Enterprise Considerations

  • Multi-tenancy token management

  • Token capping by subscription tier (Free/Pro/Enterprise)

  • Cost control through parameter optimization

  • Preventing resource over-utilization by single tenants
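Token capping by tier can be sketched as a per-tenant budget that limits how many tokens a request may generate. This is an illustrative sketch only: the tier caps are invented numbers, and `TokenBudget` with its methods is a hypothetical in-memory helper, where a production system would need persistent, concurrency-safe storage.

```python
from dataclasses import dataclass, field
from typing import Dict

# Assumed monthly token caps per subscription tier (illustrative values).
TIER_CAPS: Dict[str, int] = {
    "free": 10_000,
    "pro": 500_000,
    "enterprise": 5_000_000,
}


@dataclass
class TokenBudget:
    """Tracks token usage per tenant and enforces the cap for its tier."""
    caps: Dict[str, int]
    used: Dict[str, int] = field(default_factory=dict)

    def remaining(self, tenant_id: str, tier: str) -> int:
        """Tokens the tenant can still spend in this period (never negative)."""
        return max(self.caps[tier] - self.used.get(tenant_id, 0), 0)

    def allowed_max_tokens(self, tenant_id: str, tier: str, requested: int) -> int:
        """Clamp a request's max length to what the tenant has left."""
        return min(requested, self.remaining(tenant_id, tier))

    def record(self, tenant_id: str, tokens: int) -> None:
        """Add the tokens actually consumed by a finished request."""
        self.used[tenant_id] = self.used.get(tenant_id, 0) + tokens


budget = TokenBudget(TIER_CAPS)
budget.record("school-a", 9_800)  # a free-tier tenant close to its cap

limit_a = budget.allowed_max_tokens("school-a", "free", requested=500)
limit_b = budget.allowed_max_tokens("school-b", "pro", requested=500)
```

Because usage is tracked per tenant, one tenant exhausting its budget does not reduce what the others can request.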
