AI discussions
This section gathers material to support deeper discussions about Gen AI, whether in casual conversation or in interviews.
See also the structured learning path in Guide for AI (Phases 4–5 cover Gen AI and agentic systems).
1. Explain LLM fundamentals
Cover the high-level workings of models like GPT-3, including transformers, pre-training, fine-tuning, etc.
- General LLM introduction
- Transformer and GPT-3 summary
- How LLM pre-training is done
- How to fine-tune an existing model
- How RAG works
- Embeddings and NLP processing
- Inference parameters (temperature, top-p)
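The inference parameters listed above can be illustrated with a small sampler. This is a minimal sketch, not how any particular provider implements decoding: the function names and the example logits are hypothetical, and real APIs apply these steps over a full vocabulary inside the model server.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities; low temperature sharpens, high flattens."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalize so the surviving probabilities sum to 1.
    return {index: p / total for index, p in kept}


def sample_token(logits, temperature=1.0, top_p=0.9, rng=random):
    """Sample one token index using temperature scaling followed by top-p filtering."""
    probs = softmax_with_temperature(logits, temperature)
    nucleus = top_p_filter(probs, top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```

With a very small top_p only the most likely token survives the filter, so sampling becomes effectively greedy. Raising the temperature spreads probability mass and lets more tokens into the nucleus.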
Code samples (src/)
| Example | Path | Purpose |
|---|---|---|
| OpenAI API client | openai_api.py | Direct OpenAI API usage without LangChain. See OpenAI readme |
| First LangChain chain | first_openai_lc.py | Basic chain; illustrates knowledge cut-off / hallucination |
| OpenAI retrieval | openai_retrieval_lc.py | Crawl docs, FAISS vector store, retriever chain |
| HuggingFace embeddings | embeddings_hf.py | Embedding model usage |
| Bedrock access | test_bedrock_access.py | AWS Bedrock client setup |
| Ollama local LLM | llama_lc.py | Local model via Ollama. See Ollama readme |
2. Discuss prompt engineering
Talk through techniques like demonstrations, examples, and plain-language prompts to optimize model performance.
- Prompt Engineering
- Zero-shot and few-shot prompting
- Chain of Thought (CoT)
- Prompt chaining and Tree of Thoughts
- Automatic Prompt Engineering (APE)
- ReAct prompting
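Zero-shot, few-shot, and chain-of-thought prompting differ mainly in what is placed before the question. The sketch below shows one way to assemble such prompts; the function name, the Q/A/Reasoning labels, and the dictionary keys are illustrative assumptions, not a format required by any model.

```python
def build_few_shot_prompt(instruction, examples, question, chain_of_thought=False):
    """Assemble a prompt from an instruction, worked examples, and the new question.

    An empty examples list gives a zero-shot prompt. With chain_of_thought=True,
    each example's reasoning is included and the prompt ends on a reasoning cue.
    """
    parts = [instruction.strip(), ""]
    for example in examples:
        parts.append(f"Q: {example['question']}")
        if chain_of_thought and example.get("reasoning"):
            parts.append(f"Reasoning: {example['reasoning']}")
        parts.append(f"A: {example['answer']}")
        parts.append("")
    parts.append(f"Q: {question}")
    # End on the cue the model is expected to continue from.
    parts.append("Reasoning:" if chain_of_thought else "A:")
    return "\n".join(parts)
```

Passing the same examples with and without chain_of_thought makes it easy to compare the two styles on the same question.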
Prompt engineering code (src/ and e2e-demos/)
| Example | Path | Purpose |
|---|---|---|
| CoT with Bedrock | test_bedrock_cot.py | Chain-of-thought prompts; sample inputs in cot3.txt |
| Program-Aided Language | test_pal_with_claude.py | PAL prompting with Claude on Bedrock |
| Prompt builder graph | prompt_builder_graph.py | LangGraph prompt construction |
| Customer response | response_to_unhappy_customer.py | Domain-specific prompt for text generation |
| Model evaluation | evaluation.py | Assess and compare prompt / model outputs |
| Critical thinking prompt | think_deeply/ | 5-stage thinking framework demo. See e2e readme |
3. Share LLM project examples
Walk through hands-on experience with models like GPT-3 and with tools such as LangChain and vector databases.
- Review RAG positioning and architecture
- Advanced RAG techniques (multi-query, fusion, HyDE)
- LangChain overview
- LangGraph agent patterns
RAG and Q&A projects
| Example | Path | Purpose |
|---|---|---|
| Streamlit RAG demo | qa_retrieval/Main.py | RAG impact on response quality using Lilian Weng agent blog |
| Build vector store | build_agent_domain_rag.py | Index Lilian Weng multi-agent blog into ChromaDB |
| Query domain store | query_agent_domain_store.py | Chat against persisted vector store |
| Multiple queries RAG | multiple_queries_rag.py | Multi-query retrieval expansion |
| RAG fusion | rag_fusion.py | Reciprocal rank fusion across queries |
| RAG HyDE | rag_hyde.py | Hypothetical document embeddings |
| Adaptive RAG | adaptive_rag.py | LangGraph adaptive retrieval routing |
| QA pipeline | qa-pipeline.py | Bedrock + ChromaDB Q&A chain |
| Chat with PDF | chat_with_pdf/app.py | End-to-end PDF Q&A app |
| RAG readme | rag/README.md | Index of all RAG examples |
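The reciprocal rank fusion step named in the table can be sketched in a few lines. This is a generic illustration and not necessarily what rag_fusion.py does: the function name and document ids are hypothetical, and k=60 is a commonly used smoothing constant, assumed here as a default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. Higher total score means a better fused rank.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, best first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In a multi-query setup, each generated query would produce one ranked list from the retriever, and the fused ranking would decide which chunks reach the prompt.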
Agentic project examples
| Example | Path | Purpose |
|---|---|---|
| Agno overview | agno.md | Agno SDK patterns and documentation |
| Agno studies | src/agentic/agno/ | Agents with Ollama, tools, knowledge, workflows |
| Deep researcher | deep_researcher/ | Multi-agent research workflow. See README |
| LLM wiki agent | llm-wiki/ | RAG over a personal wiki with Agno |
| ReAct LangGraph | react_lg.py | ReAct agent with chain-of-thought prompt |
| Agent with tools | openai_agent.py | Tool calling with retriever and Tavily search |
4. Stay updated on research
Mention the latest papers and innovations in few-shot learning, prompt tuning, chain-of-thought prompting, etc.
- Few-shot and zero-shot prompting — in-context learning without weight updates
- Instruction tuning — fine-tuning on task instructions rather than per-task datasets
- Chain of Thought — intermediate reasoning steps; code in test_bedrock_cot.py
- Tree of Thoughts — search over reasoning paths
- Agentic AI — planning, memory, tools, multi-agent patterns
- Agno framework notes
- Model Context Protocol (MCP) — standardized tool integration for assistants
- Hermes agent notes
5. Dive into model architectures
Compare transformer-based models such as GPT-3 and Codex. Explain self-attention, encodings, model depth, etc.
- Transformer architecture — attention, embeddings, positional encoding, decoder-only vs encoder-decoder
- Encoder-decoder and generative models — three transformer types
- Transfer learning and fine-tuning — reusing pre-trained weights (CV and NLP)
- Deep learning foundations — CNNs, training loops, pre-trained model usage
- PyTorch computer vision — CNN training with fashion_cnn.py
- Distributed training (DDP) — multi-GPU training; references minGPT fine-tuning
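The self-attention mechanism mentioned in this section can be shown with a single-head, pure-Python sketch of scaled dot-product attention. It assumes queries, keys, and values are plain lists of equal-length float vectors; real implementations use batched tensors, learned projections, and multiple heads. The causal flag mimics the masking used by decoder-only models.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values, causal=False):
    """Return one output vector per query: a weighted average of the value vectors.

    Weights are softmax(q . k / sqrt(d_k)). With causal=True, position i
    cannot attend to positions after i.
    """
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # masked: weight becomes 0
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        weights = softmax(scores)
        width = len(values[0])
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(width)]
        )
    return outputs
```

Calling the function with the same matrix for queries, keys, and values gives self-attention over a toy sequence.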
6. Work with Skills
Skills package reusable agent capabilities (prompts, tools, workflows) for AI coding assistants and orchestration platforms.
- At startup, an agent only reads a tiny piece of metadata (the skill's description). It doesn't load the heavy instructions or assets until it explicitly decides the skill is relevant to the user's task.
- A standard skill is packaged as a self-contained folder, built around a SKILL.md file and supporting files such as a /scripts directory.
- The SKILL.md file includes:
    - YAML Frontmatter (The Metadata): Located at the very top of the file. You must provide a crisp name and a hyper-focused description. Treat this description like regex for the agent's brain: it acts as the trigger condition (Zhang, n.d.).
    - The Procedural Body: Written in standard Markdown. This is where you lay out the multi-step, phased workflows, conditional logic, and specific tool execution expectations for the agent (Chen, n.d.).
    - Specification Tip: If your description is too broad, the agent will trigger it mistakenly; if it's too narrow, the agent won't reuse it when a slightly different task arises. Striking a balance is key.
- Implement Execution Best Practices:
    - If a step in your workflow requires zero improvisation (e.g., parsing a specific CSV format or calling an internal API), do not write natural language instructions for it. Put it in a script inside the /scripts directory and instruct the agent to execute it.
    - Prevent Plan Drift: Given small variations in input, language models tend to reorder steps or alter tool arguments. Use clear, sequential step boundaries in your SKILL.md to force the agent into a predictable execution loop: Gather context -> Take action -> Verify results.
    - As agents advance, they often branch into parallel sub-agents. Ensure your skill scripts do not cause state collisions if invoked simultaneously.
- Validation & Continuous Evaluation:
    - Use basic linting to ensure your YAML frontmatter fields are complete and the folder layout complies with the standard skill format.
    - Then move to paired simulation trials that compare agent trajectories with the skill against a baseline without it. This is how you objectively measure whether the skill genuinely improves efficiency, accuracy, and safety.
- Claude / Cursor agent skills (SKILL.md format, .cursor/skills/ setup)
- "Skills Are the New Apps – Now It's Time for Skill OS", Le Chen et al.
- Agentic Continuous Evaluation of Skills (ACES), Kevin C
- SkillComposer: Learning to Evolve Agent Skills for Specification and Generalization
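The basic linting step described in this section could look like the sketch below. It is a simplified illustration under stated assumptions: frontmatter is delimited by lines containing only ---, only flat key: value pairs are handled (a real linter would use a YAML parser), and the 60-word description limit is an arbitrary threshold chosen for the example. The function names are hypothetical.

```python
REQUIRED_FIELDS = ("name", "description")


def parse_frontmatter(text):
    """Split a SKILL.md-style document into (metadata dict, body string)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' frontmatter delimiter")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("missing closing '---' frontmatter delimiter") from None
    metadata = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"cannot parse frontmatter line: {line!r}")
        metadata[key.strip()] = value.strip().strip("\"'")
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


def lint_skill(text, max_description_words=60):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    try:
        metadata, body = parse_frontmatter(text)
    except ValueError as error:
        return [str(error)]
    problems = []
    for field in REQUIRED_FIELDS:
        if not metadata.get(field):
            problems.append(f"missing or empty field: {field}")
    if len(metadata.get("description", "").split()) > max_description_words:
        problems.append("description is long; a focused trigger is easier to match")
    if not body:
        problems.append("procedural body is empty")
    return problems
```

A check like this only covers structure. Whether the description triggers at the right moments still needs the paired trials described above.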
7. Discuss fine-tuning techniques
Explain supervised fine-tuning, parameter-efficient fine-tuning, few-shot learning, and other methods to specialize pre-trained models for specific tasks.
- Model fine-tuning overview — when to fine-tune vs RAG vs prompt engineering
- Transfer learning — load pre-trained weights, adapt to custom data
- NLP embeddings and fine-tuning — document-level embedding training
- Few-shot prompting — in-context examples without weight updates
- Instruction tuning (supervised fine-tuning on instruction datasets, often followed by RLHF for alignment)
- RAG vs fine-tuning tradeoffs — cost, quality, and skill requirements
- Resume tailoring demo — prompt-based specialization to a job description (not weight fine-tuning, but illustrates task adaptation)
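To make "parameter-efficient" concrete, the arithmetic below uses LoRA as one example of such a method. LoRA is not named in the list above and is assumed here purely for illustration: it freezes a d_out x d_in weight matrix and trains two low-rank factors of shapes d_out x r and r x d_in. The layer size and rank are made-up values.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_trainable_params(d_out, d_in, rank):
    """Trainable parameters for a rank-r update: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_out + d_in)


def trainable_fraction(d_out, d_in, rank):
    """Share of the layer's parameters that a rank-r update actually trains."""
    return lora_trainable_params(d_out, d_in, rank) / full_finetune_params(d_out, d_in)
```

For a hypothetical 4096 x 4096 projection with rank 8, the full matrix has 16,777,216 parameters while the low-rank update trains 65,536, which is about 0.39% of the layer.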
8. Demonstrate production engineering expertise
From tokenization to embeddings to deployment, showcase your ability to operationalize models at scale and to monitor model inference.
- OpenAI streaming and deployment — API patterns for production
- Feature stores — Feast and FeatureForm for ML serving
- Methodology for AI projects — scoping, risk, and team skills
Production-oriented examples
| Example | Path | Purpose |
|---|---|---|
| FastAPI streaming server | web_server_wt_streaming.py | Streaming chain/agent behind HTTP |
| Async streaming | async_stream.py | Async OpenAI streaming client |
| Model evaluation | evaluation.py | Output quality assessment |
| AgentOS | first_agent_os.py | Agno production agent server |
| LiteLLM proxy + Prometheus | claude_code_liteLLM/ | LLM gateway with monitoring. See README |
| Streaming demo | streaming-demo/ | LangGraph streaming UI |
| Feast feature store | feast/ | Feature repo and serving examples |
| FeatureForm | featureform/ | Feature definitions and training pipeline |
| DDP multi-GPU | multi_gpu_ddp.py | Distributed PyTorch training |
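Streaming and inference monitoring, the two themes of this section, can be combined in a small standard-library sketch. The token stream here is a stand-in for a real model client, and the function names and metric keys are hypothetical. The idea is only to show how a consumer can record time to first token and total latency while it reads an async stream.

```python
import asyncio
import time


async def fake_token_stream(text, delay=0.0):
    """Stand-in for a model stream: yields one whitespace-delimited token at a time."""
    for token in text.split():
        await asyncio.sleep(delay)  # simulates network or generation latency
        yield token


async def consume_stream(stream):
    """Collect tokens from an async stream and record simple latency metrics."""
    start = time.perf_counter()
    first_token_at = None
    tokens = []
    async for token in stream:
        if first_token_at is None:
            first_token_at = time.perf_counter() - start
        tokens.append(token)
    return {
        "text": " ".join(tokens),
        "token_count": len(tokens),
        "time_to_first_token_s": first_token_at,
        "total_s": time.perf_counter() - start,
    }
```

Running asyncio.run(consume_stream(fake_token_stream("hello streaming world", delay=0.01))) returns the reassembled text with the two timings, which could then be exported to whatever monitoring backend is in use.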