Thursday, May 14, 2026

3.5 LangChain Framework Demo - Build Your First LangChain Project: Stop Coding, Start Chaining: Build Your First AI App in 6 Steps

Prerequisites

Stop Coding, Start Chaining: Build Your First AI App in 6 Steps

1. Introduction: The "Aha!" Moment of Modern AI

The current explosion of artificial intelligence can feel overwhelming. Every day, a new model or tool is released, making it seem like you need a PhD in Mathematics or a decade of software engineering experience just to participate.

However, the secret to modern AI development isn't complex calculus—it’s knowing how to use the right "LEGO bricks." LangChain and HuggingFace are those bricks. Think of HuggingFace as a massive, open-source library of ready-to-use AI brains, and LangChain as the connective tissue that snaps those brains together with your specific instructions. Together, they allow you to build sophisticated applications by simply orchestrating a flow of information.

2. Step 1: Setting Up Your Digital Laboratory

Before you can build, you must prepare your environment. Think of this as gathering your tools and clearing your workbench before starting a woodworking project. In the world of AI, this means calling in the "heavy hitters"—the software libraries that handle the complex communication between your computer and the AI.

Preparing Your Environment

In your digital lab (like a Google Colab notebook), your first move is to install the essential packages. We use the following command to bring in the logic:

pip install langchain-huggingface huggingface_hub

By installing these, you are providing your workspace with the specific vocabulary it needs to understand how to talk to the world’s most powerful language models.

3. Step 2: Securing Your "Golden Ticket" (API Keys)

To use the powerful models hosted on HuggingFace, you need a way to identify yourself to their servers. This is done through an API.

  • API (Application Programming Interface): A set of rules and connections that allows one piece of software to talk to another.

In this step, you set up your "digital passport"—the API key. This key tells HuggingFace that you have permission to use their models. Because this key is private, we don't just type it into the code where anyone can see it. Instead, we use a tool called getpass to hide your input, ensuring your "Golden Ticket" stays secure.

Security Warning: Accessing models requires a valid API token from your HuggingFace account settings. Always store this in an environment variable (like HUGGINGFACEHUB_API_TOKEN) rather than hard-coding it, to keep your account safe from "key theft."
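As a minimal sketch of the getpass approach described above: the helper name ensure_token and its two parameters are hypothetical, introduced only for illustration. The environment variable name is the one given in the warning.

```python
import getpass
import os

TOKEN_VAR = "HUGGINGFACEHUB_API_TOKEN"  # variable name taken from the warning above


def ensure_token(env=os.environ, ask=getpass.getpass):
    """Return the token stored in `env`; prompt with hidden input only if it is missing.

    `ensure_token`, `env`, and `ask` are illustrative names, not part of any library.
    """
    token = env.get(TOKEN_VAR)
    if not token:
        token = ask("Hugging Face token: ").strip()
        env[TOKEN_VAR] = token  # later library calls can read it from the environment
    return token
```

In a notebook you would call ensure_token() once near the top. The prompt hides what you type, and the key never appears in the saved cell.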

4. Step 3: Choosing Your AI "Brain" with HuggingFaceHub

Once your lab is secure, you need to select a "brain" for your application. This is where we choose our LLM.

  • LLM (Large Language Model): A type of AI trained on massive amounts of text to understand and generate human-like language.

For our lab, we use a model called google/flan-t5-large. It is a good choice for beginners because it is small enough to run without special hardware and was trained specifically to follow instructions. The game-changer here is that you don't have to "train" this AI yourself, a process that can cost millions of dollars for the largest models. You are simply "renting" the intelligence of a pre-trained model and putting it to work immediately.

5. Step 4: Mastering the Art of the Prompt Template

An AI is only as good as the instructions you give it. While you can "chat" with an AI, building a reliable application requires a Prompt Template.

A template allows you to create a structured instruction that the AI can follow every single time, even when the user’s question changes. This ensures that the AI’s responses remain consistent. It transforms the AI from a random chatterbox into a specialized tool.

The Code Structure:

template = "Question: {question}\nAnswer:"
prompt = PromptTemplate.from_template(template)

By using the {question} placeholder, you create a "slot" that your application can fill in dynamically, making your AI both smart and adaptable.
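To make the "slot" idea concrete, here is a small sketch that uses only Python's built-in string formatting as a stand-in for PromptTemplate. The helper names placeholder_names and fill are made up for this example. LangChain's PromptTemplate performs the same kind of substitution through its format method.

```python
from string import Formatter

template = "Question: {question}\nAnswer:"


def placeholder_names(text):
    """List the {slot} names that appear in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(text) if name]


def fill(text, **values):
    """Fill every slot; raises KeyError if a slot is left without a value."""
    return text.format(**values)


print(placeholder_names(template))
# The same template is reused for different questions.
for q in ["What is the capital of France?", "Who wrote Hamlet?"]:
    print(fill(template, question=q))
    print("---")
```

The instructions stay fixed while only the question changes, which is what keeps the responses consistent from one call to the next.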

6. Step 5: The "Chain" that Brings it to Life

This is the core magic of the framework: the Chain.

  • Chain: A sequence of operations that links various AI components together to perform a task automatically.

In our workflow, the Chain acts as the "manager." It takes the user's input, plugs it into the Prompt Template, hands that completed instruction to the LLM (the brain), and then delivers the final answer back to you. This "chaining" process is the most surprising part for beginners; it automates the handover of data so you don't have to manually move text from one step to the next.
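The handover the Chain performs can be sketched in a few lines of plain Python. This is an illustration of the idea, not LangChain's implementation: make_chain and fake_llm are hypothetical names, and fake_llm is a stand-in that generates nothing.

```python
from typing import Callable


def make_chain(template: str, llm: Callable[[str], str]) -> Callable[[str], str]:
    """Return a function that fills the template, then passes the prompt to `llm`."""

    def run(question: str) -> str:
        prompt = template.format(question=question)  # step 1: plug the input into the template
        return llm(prompt)  # step 2: hand the finished prompt to the model

    return run


def fake_llm(prompt: str) -> str:
    """Stand-in model (hypothetical): reports what it received instead of generating text."""
    return f"[model received {len(prompt)} characters]"


chain = make_chain("Question: {question}\nAnswer:", fake_llm)
print(chain("What is the capital of France?"))
```

Swapping fake_llm for a real model changes nothing about the flow, which is the point of chaining.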

7. Step 6: The Moment of Truth (Execution)

The final step is the payoff. You invoke the chain by asking a question, such as "What is the capital of France?"

Seeing the AI respond—using your template and your selected model—is a powerful experience. It confirms that you have moved from being an AI consumer to an AI creator. In the lab, this looks like calling chain.invoke("Your Question Here"). Suddenly, the model processes your specific logic and returns a clean, intelligent response. You’ve just successfully navigated the entire lifecycle of a modern AI app.

8. Conclusion: Where Will You Build Next?

These six steps (setup, security, model selection, templating, chaining, and execution) are the foundation of many modern LLM applications. Whether you want to build a translator, a coding assistant, or a creative writing partner, the overall workflow stays largely the same.

The beauty of LangChain and HuggingFace is that they remove the technical barriers, leaving only your creativity as the limit.

What kind of "chain" would you build if you could automate any conversational task in your daily life?

To see these steps in action and run the actual code yourself, head over to the Digital Lab (Google Colab) and start experimenting today!


Step 1: Set up the environment and install the required libraries and dependencies

In this step, you use pip to download the specific software packages needed for the project. By specifying version numbers (like ==0.1.17), you ensure that the libraries are compatible with each other and that the code runs exactly as the instructor intended.

The core installations include transformers for AI model access, accelerate for performance, sentencepiece for text processing, the langchain suite for building applications, and huggingface_hub for cloud connectivity.




# Step 1: Install dependencies.
# IMPORTANT: After this completes, if imports fail, go to 'Runtime' -> 'Restart session'
!pip install transformers==4.36.2 accelerate==0.25.0 sentencepiece langchain==0.1.17 langchain-core==0.1.52 langchain-community==0.0.36 huggingface_hub


# Original

pip install ipykernel
python -m ipykernel install --user --name myenv --display-name "Python (myenv)"
pip install --upgrade pip setuptools wheel
pip install transformers==4.36.2 accelerate==0.25.0 sentencepiece
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install langchain==0.1.17 langchain-core==0.1.52 langchain-community==0.0.36
pip install huggingface_hub


This line of code is setting the foundation for your entire project. Here is a summary of exactly what each "tool" in that command does:

  • !pip install: This tells the environment (like Google Colab) to download and install specific software packages from the internet.

  • transformers==4.36.2: This is the core library from Hugging Face. It provides the actual architecture for models like Flan T5 Large so your code knows how to "talk" to the model.

  • accelerate==0.25.0: This library helps the AI run more efficiently. It automatically handles the hardware (like your GPU) so the model generates text faster and uses memory more effectively.

  • sentencepiece: This is a "tokenizer" tool. It breaks down your human sentences into smaller mathematical chunks (tokens) that the AI can understand.

  • langchain (and related packages): These are the "glue." LangChain allows you to take a raw AI model and turn it into a useful application by managing prompts and "chaining" different tasks together.

    • langchain-core (The Foundation): This is the base layer. It contains all the essential "rules" and basic building blocks (like how a prompt should look or how a model should behave) that everything else in the project depends on.

    • langchain-community (The Integrations): This box contains all the "connectors". If you want to connect LangChain to a specific database, a search engine, or a tool like Hugging Face, those specific pieces of code live here.

    • langchain (The Main Package): This is the user-facing part that ties everything together. It contains the high-level logic for things like Chains (sequences of actions) and Agents (AI that can make decisions).

  • huggingface_hub: This allows your script to securely log into the Hugging Face platform to download the model files using your API token.

  • The "Restart session" Note: When you install new libraries in Colab, the "brain" of the current session needs to be refreshed to recognize the new software you just added. If you don't restart, the computer might say it can't find the libraries you just installed.

The following seven packages appear in both versions, with the same version pins wherever a version is given:

  • transformers==4.36.2: The main Hugging Face library for loading models.

  • accelerate==0.25.0: Used to speed up model performance.

  • sentencepiece: The tokenizer that helps the model read text.

  • langchain==0.1.17: The main framework for building AI applications.

  • langchain-core==0.1.52: The foundational logic for LangChain.

  • langchain-community==0.0.36: The connectors for third-party tools like Hugging Face.

  • huggingface_hub: Listed inline in the instructor's command and on a separate line in the original. Either way it does the same thing (connects to the model repository).

Two groups of packages appear only in the original:

  1. torch: The original had a specific line for a "CPU-only" version of PyTorch. The instructor likely leaves this out because Google Colab already ships with a GPU-enabled build of PyTorch pre-installed, so there is nothing to download manually.

  2. ipykernel / setuptools / wheel: These were only in the original version because they are needed for setting up a local computer. They aren't necessary for the instructor's demo in the cloud.

This is a streamlined installation script optimized for Google Colab that sets up a full Generative AI environment by installing the core Hugging Face model tools, memory-efficiency libraries, and the complete LangChain framework for building AI applications.
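After installing (and restarting the session if needed), one way to confirm that the pinned versions are the ones actually present is to query the installed package metadata. The sketch below uses only the standard library. The helper name check_pins and the status strings are illustrative choices.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install command above; None means "any version".
PINS = {
    "transformers": "4.36.2",
    "accelerate": "0.25.0",
    "sentencepiece": None,
    "langchain": "0.1.17",
    "langchain-core": "0.1.52",
    "langchain-community": "0.0.36",
    "huggingface_hub": None,
}


def check_pins(pins=PINS):
    """Return {package: status}: 'ok', 'missing', or 'found <other version>'."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if wanted in (None, found) else f"found {found}"
    return report


for package, status in check_pins().items():
    print(f"{package:20} {status}")
```

Any line that does not say "ok" suggests re-running the install cell before moving on.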


Step 2: Import all the required libraries

This step involves opening the "toolboxes" you just installed so their functions can be used in your code. You import warnings to keep the output clean and os to securely handle your API keys via environment variables.

Additionally, you load the transformers pipeline and essential LangChain components like LLMChain and PromptTemplate inside a "try-except" block, which serves as a safety check to confirm everything was installed correctly before moving to the next phase.


  • Line 1: import warnings

  • This line brings in Python’s built-in warnings library. Its purpose is to allow the script to manage and control the warning messages that Python or its installed libraries might trigger during execution.

  • Line 2: warnings.filterwarnings("ignore")

  • This line specifically tells the script to silence all warning messages. It is used in this demo to keep the output clean and professional by preventing non-critical technical warnings from cluttering the screen while you run your AI models.

  • Line 4: import os — This loads the "bridge" tool that allows your Python code to talk to the Google server's operating system.

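  • Line 5: hf_token = "..." — This creates a variable to store your Hugging Face security key (your digital ID) so the script can use it. Pasting the real key into the cell is convenient for a demo, but reading it with getpass, as described in the API key step earlier, keeps it out of the saved notebook.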

  • Line 6: os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_token — This "sticks" your key onto the server's memory wall so that all your AI libraries can find it and log you in automatically.

  • Line 7: import torch — This imports the actual mathematical "engine" that powers the AI model's thinking and calculations.

  • Line 11: try:

  • This starts a "guarded" section. You are telling Python, "Try to run the following lines, but if something goes wrong, don't crash the whole program."

  • Lines 12–14: from langchain... import ...

  • These are the actual imports for your AI logic.

    • Pinned versions from Step 1: langchain==0.1.17, langchain-core==0.1.52, langchain-community==0.0.36

    • from langchain_community.llms import HuggingFacePipeline  # (langchain-community)

    • from langchain.chains import LLMChain                   # (langchain)

    • from langchain.prompts import PromptTemplate            # (langchain-core)

    • Line 12: HuggingFacePipeline (langchain-community) — This tool serves as the connector that wraps your local or cloud-based Hugging Face model so it can be controlled by LangChain.

    • Line 13: LLMChain (langchain) — This is the primary orchestration tool that connects your Large Language Model (LLM) to a prompt, managing the logic of taking your question and getting back an AI answer.

    • Line 14: PromptTemplate (langchain-core) — This is the foundational blueprint used to create structured questions for the AI, allowing you to use variables like {question} to reuse the same prompt for different inputs.

  • Line 15: print("Libraries imported successfully!")

  • If lines 12–14 work perfectly, this message pops up to give you peace of mind that everything is ready.

  • Line 16: except ModuleNotFoundError:

  • This is the "Plan B." If Python looks for LangChain and can't find it (perhaps because the installation in Step 1 failed), it jumps straight to this line instead of showing a scary red error code.

  • Line 17: print("Error: langchain_community not found...")

  • This prints a friendly reminder telling you exactly what went wrong and how to fix it (by running the installation again).
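The cell being described is not reproduced in this post, so the following is a reconstruction from the line-by-line notes rather than the instructor's exact code. Two things are assumptions: every third-party import is placed inside the try block so the example fails gracefully, and the IMPORTS_OK flag is added for illustration. The token assignment is left out on purpose.

```python
import os
import warnings

warnings.filterwarnings("ignore")  # silence non-critical warnings in the output

try:
    import torch  # numerical engine used by the model
    from transformers import pipeline  # high-level Hugging Face helper
    from langchain_community.llms import HuggingFacePipeline  # (langchain-community)
    from langchain.chains import LLMChain  # (langchain)
    from langchain.prompts import PromptTemplate  # (langchain-core)

    IMPORTS_OK = True  # illustrative flag, not part of the described cell
    print("Libraries imported successfully!")
except ModuleNotFoundError as err:
    IMPORTS_OK = False
    # err.name holds the module Python could not find
    print(f"Error: {err.name} not found. Re-run the Step 1 install cell, then restart the session.")
```

Because the except clause names ModuleNotFoundError specifically, other problems (for example a typo in an import) still surface as normal errors instead of being hidden.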



Step 3 is where the heavy lifting starts. Here is the explanation for lines 1–7 of the Step 3 code:

  • Line 1: model_name = "google/flan-t5-large" — Specifies the exact Google-trained AI model to download from the Hugging Face library.

  • Line 3: hf_pipeline = pipeline( — Creates an all-in-one "pipeline" object that bundles the model, its processor, and the generation logic together.

  • Line 4: task="text2text-generation", — Instructs the AI that its specific job is to take a text input (like a question) and produce a text output (the answer).

  • Line 5: model=model_name, — Links the pipeline to the Flan-T5 Large "brain" defined in the first line.

  • Line 6: device=-1 # CPU only — Forces the code to run on the server's central processor instead of a graphics card, ensuring stability in a standard Colab environment.

  • Line 7: ) — Closes the pipeline configuration command so the computer can begin the multi-gigabyte model download.

In short: You are telling Google Colab to download a specific Google AI model and prepare it to handle text-to-text tasks using the server's CPU. The pipeline is a high-level object that configures the AI model and hardware device to execute a specific text-generation task.
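Put together, the seven lines described above amount to roughly the following. This is a sketch based on that description: the values come from the text, while the PIPELINE_CONFIG dictionary and the build_pipeline wrapper are illustrative additions so the settings can be inspected before the large download starts.

```python
# Values taken from the line-by-line description above.
PIPELINE_CONFIG = {
    "task": "text2text-generation",
    "model": "google/flan-t5-large",
    "device": -1,  # -1 selects the CPU
}


def build_pipeline(config=PIPELINE_CONFIG):
    """Create the pipeline; the first call downloads roughly 3 GB of model files."""
    # Imported here so the configuration can be read even where transformers is absent.
    from transformers import pipeline

    return pipeline(**config)
```

Calling hf_pipeline = build_pipeline() and then hf_pipeline("some question") returns a list of dictionaries, each with a "generated_text" entry holding the model's answer.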

The green progress bars represent the successful download and assembly of your AI's brain. Here is what each status bar accomplished:

  • config.json: Downloaded the architectural "blueprint" that tells the software how to assemble the model's layers.

  • model.safetensors: Downloaded the primary 3.13GB file containing the actual "knowledge" and weights of the Flan-T5 model.

  • generation_config.json: Retrieved the default settings for how the AI should behave when creating text, such as maximum length or creativity levels.

  • tokenizer_config.json & tokenizer.json: Obtained the conversion tools that translate your typed words into numbers the AI can process.

  • spiece.model: Loaded the specific vocabulary file needed to break complex words into smaller, manageable pieces (tokens).

  • special_tokens_map.json: Downloaded the map for "hidden" characters that tell the AI where a sentence starts or ends.

You configured the pipeline object to manage the Flan-T5 model and CPU device for the specific task of text generation. Executing this code triggered the download of over 3GB of "green status" files, including the model's weights, architecture blueprints, and language tokenizers. This process successfully assembled all the necessary components into a ready-to-use AI "brain" on your Colab server.




In Step 4, you are adjusting the generation settings of your AI "chef" and wrapping it in a LangChain uniform so it can work with the rest of your application. (No retraining happens here; the model's weights stay exactly as downloaded.)

The Breakdown of Step 4 (Lines 1–8)

  • Line 3: max_new_tokens = 512 — This sets the length limit. It tells the AI it can generate an answer up to 512 tokens long (a token is a word or a piece of a word), which leaves enough room that typical answers are not cut off mid-sentence.

    • Current (512): The AI has a "long leash." If you ask it to "Write a story about a cat," it can write several detailed paragraphs without getting cut off mid-thought.

    • Different Value (10): The AI has a "very short leash." If you ask the same cat story question, it might only say, "Once upon a time there was a small orange cat named..." and then stop abruptly.

  • Line 4: temperature = 0.7 — This controls creativity vs. predictability. A value of 0.7 is a common middle ground that lets the AI be flexible and conversational. Higher values make unlikely words more probable, which raises the risk of incoherent or made-up output.

    • Current (0.7): The "Conversationalist." It is mostly logical but adds enough variety to feel human. For "What's a good snack?", it might suggest "A crisp apple with peanut butter".

    • Different Value (0 or 0.1): The "Robot." It becomes extremely literal and predictable. For the same snack question, it will almost always give the single most common answer, like "Fruit," every single time you ask.

    • Different Value (1.5): The "Wildcard." The AI becomes highly creative but chaotic. It might suggest "Sparkling neon crackers dipped in gravity".

  • Line 5: do_sample = True — This enables the randomness needed for the temperature setting to work. Without this, the AI would always pick the single most likely next word, making it very robotic and predictable.

    • Current (True): The "Brainstormer." It looks at all possible next words and picks one based on their probability. This allows the temperature setting to work its magic.

    • The Result: You get diversity and creativity; the answer might change every time you ask.

    • Example Output: "You might enjoy some air-popped popcorn with a dash of sea salt and nutritional yeast."

    • Different Value (False): The "Greedy Search." The AI ignores temperature entirely and only picks the word with the highest statistical probability. This is great for math or facts where there is only one "right" answer, but it makes creative writing repetitive and boring.

    • The Result: If you ask 10 times, you will likely get the exact same answer 10 times.

    • Example Output: "A good snack is a piece of fruit like an apple or a banana."

  • Line 8: llm = HuggingFacePipeline(...) — This is the wrapper. You are taking that raw Hugging Face pipeline from Step 3 and putting it inside a LangChain "container" so you can use it in Step 5’s chains.
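To see why temperature and do_sample interact the way the bullets describe, here is a toy version of next-word selection in plain Python. The three candidate words and their scores are invented for the example, and real models do this over tens of thousands of tokens, but the arithmetic (divide the scores by the temperature, apply softmax, then either sample or take the top choice) is the standard technique.

```python
import math
import random


def next_token_probs(scores, temperature):
    """Softmax over raw scores after dividing them by the temperature."""
    scaled = [s / temperature for s in scores.values()]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]  # subtract the max for numerical stability
    total = sum(weights)
    return {tok: w / total for tok, w in zip(scores, weights)}


def pick(scores, temperature=0.7, do_sample=True, rng=random):
    """Greedy choice when do_sample is False; otherwise a temperature-weighted draw."""
    if not do_sample:
        return max(scores, key=scores.get)  # temperature plays no role here
    probs = next_token_probs(scores, temperature)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


# Made-up scores for the word that follows "A good snack is ..."
scores = {"fruit": 2.0, "popcorn": 1.0, "crackers": 0.5}
for t in (0.1, 0.7, 1.5):
    print(t, {tok: round(p, 3) for tok, p in next_token_probs(scores, t).items()})
print(pick(scores, do_sample=False))
```

At a low temperature nearly all of the probability lands on the top-scoring word, and at a high temperature the three options move closer together, which is why sampled answers vary more.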


The Instructor's Suggestions

The instructor mentioned two key "pro" moves for this step:

  1. The Temperature Experiment: He suggested trying different settings to see how the AI's "personality" changes.

    • Cold (0.1–0.3): Very literal and deterministic. Good for math or facts.

    • Warm (0.7): Balanced. Good for general questions.

    • Hot (0.9–1.0): Very creative. Good for brainstorming, but risky for facts.

  2. Using a GPU (The Accelerator): He pointed out that while you are using the CPU (device=-1) for this demo, real-world professionals use a GPU (Graphics Processing Unit) to make the AI respond significantly faster. In Colab, you can switch this under Runtime > Change runtime type, though it can consume your "compute units" quickly.

One-Sentence Summary: Step 4 configures your model's creativity and length limits before wrapping the entire pipeline in a LangChain interface so it can be easily managed in future steps.
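The post does not show how the three settings reach the model, so the sketch below makes an assumption: it passes them as extra keyword arguments when the transformers pipeline is created, then wraps the result with HuggingFacePipeline(pipeline=...). The names GENERATION_SETTINGS and build_llm are illustrative, and the instructor's cell may wire the settings differently.

```python
# Values from the Step 4 breakdown above.
GENERATION_SETTINGS = {
    "max_new_tokens": 512,
    "temperature": 0.7,
    "do_sample": True,
}


def build_llm(model_name="google/flan-t5-large", device=-1, settings=GENERATION_SETTINGS):
    """Build the Hugging Face pipeline and wrap it for LangChain (assumed wiring)."""
    # Imported lazily so the settings can be inspected without the heavy libraries.
    from transformers import pipeline
    from langchain_community.llms import HuggingFacePipeline

    hf_pipeline = pipeline(
        task="text2text-generation",
        model=model_name,
        device=device,  # -1 = CPU; 0 would select the first GPU
        **settings,  # assumption: generation settings supplied at pipeline creation
    )
    return HuggingFacePipeline(pipeline=hf_pipeline)
```

To run the temperature experiment, call build_llm again with a modified copy of the settings, for example dict(GENERATION_SETTINGS, temperature=0.1).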

Ready to see how we build the Prompt Template in Step 5?


Step 5: Build a Chain Using LangChain

In Step 5, you move from configuring the "brain" to designing the actual conversation flow and building the "connective tissue" that makes your AI application work.

The Breakdown of Step 5 (Lines 1–16)

  • Lines 5–9: template = """..."""

    • This is your Prompt Design. You are defining a structured "script" for the AI to follow.

    • By using the {question} placeholder, you create a reusable shell where any user input can be plugged in.

    • The phrase "Let's think step by step" is a well-known Chain of Thought technique that nudges the AI to lay out intermediate reasoning before answering, which often improves accuracy on multi-step questions.

  • Lines 11–14: prompt = PromptTemplate(...)

    • This line takes your raw text script and turns it into a formal LangChain object.

    • It explicitly tells LangChain that the variable it needs to look for in your text is named "question".

  • Line 16: llm_chain = LLMChain(llm=llm, prompt=prompt)

    • This is "The Link". You are finally "chaining" together the LLM (the brain you configured in Step 4) with the Prompt (the script you just wrote).

    • The llm_chain object is now a complete, self-contained unit: you give it a question, and it knows exactly which model to use and what "personality" or format to use for the answer.
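A sketch of what this cell might contain, based on the breakdown above. The exact wording of the template is an assumption: the post only states that it contains the {question} placeholder and the phrase "Let's think step by step". The build_chain wrapper is an illustrative addition that keeps the LangChain imports out of the way until a model is available.

```python
# Assumed template text; only the placeholder and the final phrase come from the post.
TEMPLATE = """Question: {question}

Answer: Let's think step by step."""


def build_chain(llm):
    """Bind the prompt to a LangChain-compatible model, as described for lines 11-16."""
    from langchain.chains import LLMChain
    from langchain.prompts import PromptTemplate

    prompt = PromptTemplate(template=TEMPLATE, input_variables=["question"])
    return LLMChain(llm=llm, prompt=prompt)


# Preview of the text the model would receive, using plain string formatting.
preview = TEMPLATE.format(question="What is the capital of France?")
print(preview)
```

Passing the wrapped model from Step 4 to build_chain gives the llm_chain object used in the next step.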


Layman’s Terms for Step 5

If Step 4 was setting the rules for your professional translator (how long they talk and how creative they are), Step 5 is giving them a standardized work form. You’ve designed the form with a specific "Question" box and told them to always start their work with a specific "Answer" style. By "chaining" them together, you’ve created a smooth assembly line where a customer drops off a question at one end, and the finished, formatted answer comes out the other.

One-Sentence Executive Summary: You engineered a structured PromptTemplate using Chain of Thought logic and orchestrated an LLMChain to bind your optimized model to this template for consistent, context-aware execution.

Ready to see the results in Step 6?


Step 6: Test and Run the Chain on a Few Questions

In Step 6, you finally put your assembly line to work and run your first "orders" through the system. Here is the breakdown:

The Breakdown of Step 6 (Questions 1 & 2)

  • Line 2: question1 = "..."

    • This is the Raw Input. You are defining the specific text you want the AI to process.

  • Line 3: answer1 = llm_chain.invoke({"question": question1})

    • This is the Execution. The invoke method sends your question into the LLMChain.

    • The chain takes that question, plugs it into the {question} slot of your PromptTemplate (from Step 5), and then hands the finished prompt to the Flan-T5 model (from Step 4).

  • Line 4: print("Answer 1:\n", answer1["text"], "\n")

    • This is the Output Display. Because LangChain returns a "dictionary" (a structured data packet), you tell it to specifically print just the "text" portion, which is the actual answer written by the AI.
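The invoke-then-extract pattern can be tried without loading the model by using a stand-in object. In the sketch below, ask and StandInChain are hypothetical names. StandInChain only mimics the shape of the dictionary described above (the inputs plus a "text" entry), and a real llm_chain can be passed to ask in its place.

```python
def ask(chain, question):
    """Invoke the chain and return only the generated text."""
    result = chain.invoke({"question": question})  # a dict comes back
    return result["text"]  # keep just the answer


class StandInChain:
    """Hypothetical substitute that mimics the dictionary shape described above."""

    def invoke(self, inputs):
        question = inputs["question"]
        return {"question": question, "text": f"(answer to: {question})"}


print("Answer 1:\n", ask(StandInChain(), "What is the capital of France?"), "\n")
```

Wrapping the two lines in a small function makes it easy to loop over several questions when comparing temperature settings.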


The Instructor's Final "Pro" Suggestion

The instructor highly recommends experimental testing at this stage:

  1. Change the Temperature: Go back to Step 4 and change the 0.7 to a 0.1 (cold/logical) or a 0.9 (hot/creative).

  2. Rerun: Run the code in Step 4 again to update the "brain," and then come back to Step 6 and rerun the questions.

  3. Compare: You will notice that with a higher temperature, the AI might explain black holes using different metaphors each time, while a lower temperature will keep the answer very consistent and literal.

One-Sentence Executive Summary: Step 6 validates your entire orchestration by invoking the chain with real-world queries and programmatically extracting the AI-generated responses for review and comparison.

We've completed all 6 steps! You’ve successfully built a fully functioning, locally hosted AI application using the industry-standard LangChain framework. Do you want to try writing a custom question of your own to see how the model handles it?

