Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# Agents

AI agents are autonomous systems that use LLMs to reason, plan, and take actions to achieve specific goals. They combine language understanding with tool usage, memory, and decision-making to perform complex, multi-step tasks. Agents can interact with external APIs and services while maintaining context across interactions.
AI agents are autonomous systems that use LLMs to reason, plan, and take actions to achieve specific goals. They combine language understanding with tool usage, memory, and decision-making to perform complex, multi-step tasks. Agents can interact with external APIs and services while maintaining context across interactions.

Visit the following resources to learn more:

- [@official@Tool use overview - Anthropic](https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview)
- [@article@Introduction to AI Agents - DAIR.AI](https://www.promptingguide.ai/agents/introduction)
Original file line number Diff line number Diff line change
@@ -1,3 +1,9 @@
# AI Red Teaming

AI red teaming involves deliberately testing AI systems to find vulnerabilities, biases, or harmful behaviors through adversarial prompting. Teams attempt to make models produce undesired outputs, bypass safety measures, or exhibit problematic behaviors. This process helps identify weaknesses and improve AI safety and robustness before deployment.
AI red teaming involves deliberately testing AI systems to find vulnerabilities, biases, or harmful behaviors through adversarial prompting. Teams attempt to make models produce undesired outputs, bypass safety measures, or exhibit problematic behaviors. This process helps identify weaknesses and improve AI safety and robustness before deployment.

Visit the following resources to learn more:

- [@official@Define success and build evaluations - Anthropic](https://platform.claude.com/docs/en/test-and-evaluate/develop-tests)
- [@official@OWASP Top 10 for LLM Applications 2025](https://genai.owasp.org/llmrisk/)
- [@opensource@Microsoft PyRIT - Risk Identification for GenAI](https://github.com/microsoft/PyRIT)
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
# AI vs AGI

AI (Artificial Intelligence) refers to systems that perform specific tasks intelligently, while AGI (Artificial General Intelligence) represents hypothetical AI with human-level reasoning across all domains. Current LLMs are narrow AI - powerful at language tasks but lacking true understanding or general intelligence like AGI would possess.
AI (Artificial Intelligence) refers to systems that perform specific tasks intelligently, while AGI (Artificial General Intelligence) represents hypothetical AI with human-level reasoning across all domains. Current LLMs are narrow AI - powerful at language tasks but lacking true understanding or general intelligence like AGI would possess.

Visit the following resources to learn more:

- [@article@Artificial general intelligence - Wikipedia](https://en.wikipedia.org/wiki/Artificial_general_intelligence)
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# Anthropic

Anthropic created Claude, a family of large language models known for safety features and constitutional AI training. Claude models excel at following instructions, maintaining context, and avoiding harmful outputs. Their strong instruction-following capabilities and built-in safety measures make them valuable for reliable, ethical AI applications.
Anthropic develops Claude, a family of large language models focused on safety and helpfulness. The current lineup includes Claude Opus 4.7 (most capable, for complex reasoning and agentic coding), Claude Sonnet 4.6 (best speed-intelligence balance), and Claude Haiku 4.5 (fastest, near-frontier intelligence). All models support extended thinking, vision, and 1M token context windows.

Visit the following resources to learn more:

- [@official@Claude API Documentation](https://docs.anthropic.com/en/docs/intro)
- [@official@Anthropic Research](https://www.anthropic.com/research)
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
# Automatic Prompt Engineering

Automatic Prompt Engineering (APE) uses LLMs to generate and optimize prompts automatically, reducing human effort while enhancing model performance. The process involves prompting a model to create multiple prompt variants, evaluating them using metrics like BLEU or ROUGE, then selecting the highest-scoring candidate. For example, generating 10 variants of customer order phrases for chatbot training, then testing and refining the best performers. This iterative approach helps discover effective prompts that humans might not consider, automating the optimization process.
Automatic Prompt Engineering (APE) uses LLMs to generate and optimize prompts automatically, reducing human effort while enhancing model performance. The process involves prompting a model to create multiple prompt variants, evaluating them using metrics like BLEU or ROUGE, then selecting the highest-scoring candidate. For example, generating 10 variants of customer order phrases for chatbot training, then testing and refining the best performers. This iterative approach helps discover effective prompts that humans might not consider, automating the optimization process.

Visit the following resources to learn more:

- [@article@Automatic Prompt Engineer - DAIR.AI](https://www.promptingguide.ai/techniques/ape)
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
# Calibrating LLMs

Calibrating LLMs involves adjusting models so their confidence scores accurately reflect their actual accuracy. Well-calibrated models express appropriate uncertainty - being confident when correct and uncertain when likely wrong. This helps users better trust and interpret model outputs, especially in critical applications where uncertainty awareness is crucial.
Calibrating LLMs involves adjusting models so their confidence scores accurately reflect their actual accuracy. Well-calibrated models express appropriate uncertainty - being confident when correct and uncertain when likely wrong. This helps users better trust and interpret model outputs, especially in critical applications where uncertainty awareness is crucial.

Visit the following resources to learn more:

- [@article@Calibrating LLMs - LearnPrompting](https://learnprompting.org/docs/reliability/calibration)
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,7 @@ Chain of Thought prompting improves LLM reasoning by generating intermediate rea

Visit the following resources to learn more:

- [@article@Chain-of-Thought Prompting - DAIR.AI](https://www.promptingguide.ai/techniques/cot)
- [@article@Chain-of-Thought Prompting - LearnPrompting](https://learnprompting.org/docs/intermediate/chain_of_thought)
- [@article@Reasoning LLMs Guide - DAIR.AI](https://www.promptingguide.ai/guides/reasoning-llms)
- [@video@Context Engineering vs. Prompt Engineering: Smarter AI with RAG & Agents](https://youtu.be/vD0E3EUb8-8?si=Y6MCLPzjmhMB4jSu&t=203)
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# Context Window

Context window refers to the maximum number of tokens an LLM can process in a single interaction, including both input prompt and generated output. When exceeded, older parts are truncated. Understanding this constraint is crucial for prompt engineering—you must balance providing sufficient context with staying within token limits.
Context window refers to the maximum number of tokens an LLM can process in a single interaction, including both input prompt and generated output. When exceeded, older parts are truncated. Understanding this constraint is crucial for prompt engineering—you must balance providing sufficient context with staying within token limits.

Visit the following resources to learn more:

- [@official@Context windows - Anthropic](https://platform.claude.com/docs/en/build-with-claude/context-windows)
- [@article@What is a context window? - IBM](https://www.ibm.com/think/topics/context-window)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[@video@What is a Context Window? Unlocking LLM Secrets](https://youtu.be/-QVoIxEpFkM?si=Ut-ScxhUdS0JjRmb)

Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# Contextual Prompting

Contextual prompting provides specific background information or situational details relevant to the current task, helping LLMs understand nuances and tailor responses accordingly. Unlike system or role prompts, contextual prompts supply immediate, task-specific information that's dynamic and changes based on the situation. For example: "Context: You are writing for a blog about retro 80's arcade video games. Suggest 3 topics to write articles about." This technique ensures responses are relevant, accurate, and appropriately framed for the specific context provided.
Contextual prompting provides specific background information or situational details relevant to the current task, helping LLMs understand nuances and tailor responses accordingly. Unlike system or role prompts, contextual prompts supply immediate, task-specific information that's dynamic and changes based on the situation. For example: "Context: You are writing for a blog about retro 80's arcade video games. Suggest 3 topics to write articles about." This technique ensures responses are relevant, accurate, and appropriately framed for the specific context provided.

Visit the following resources to learn more:

- [@official@Prompting Best Practices - Anthropic](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices)
- [@article@Prompt Structure and Key Parts - LearnPrompting](https://learnprompting.org/docs/basics/prompt_structure)
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# Fine-tuning vs Prompt Engineering

Fine-tuning trains models on specific data to specialize behavior, while prompt engineering achieves customization through input design without model modification. Prompt engineering is faster, cheaper, and more accessible. Fine-tuning offers deeper customization but requires significant resources and expertise.
Fine-tuning trains models on specific data to specialize behavior, while prompt engineering achieves customization through input design without model modification. Prompt engineering is faster, cheaper, and more accessible. Fine-tuning offers deeper customization but requires significant resources and expertise.

Visit the following resources to learn more:

- [@article@When to use prompt engineering vs. fine-tuning - TechTarget](https://www.techtarget.com/searchEnterpriseAI/tip/Prompt-engineering-vs-fine-tuning-Whats-the-difference)
- [@article@Prompt Engineering vs Fine Tuning: When to Use Each - Codecademy](https://www.codecademy.com/article/prompt-engineering-vs-fine-tuning)
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
# Frequency Penalty

Frequency penalty reduces token probability based on how frequently they've appeared in the text, with higher penalties for more frequent tokens. This prevents excessive repetition and encourages varied language use. The penalty scales with usage frequency, making overused words less likely to be selected again, improving content diversity.
Frequency penalty reduces token probability based on how frequently they have appeared in the text, with higher penalties for more frequent tokens. This prevents excessive repetition and encourages varied language use. The penalty scales with usage frequency, making overused words less likely to be selected again, improving content diversity.

Visit the following resources to learn more:

- [@article@Frequency Penalty - LLM Parameter Guide - Vellum](https://www.vellum.ai/llm-parameters/frequency-penalty)
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# Google

Google develops influential LLMs including Gemini, PaLM, and Bard. Through Vertex AI and Google Cloud Platform, they provide enterprise-grade model access with extensive prompt testing via Vertex AI Studio. Google's research has advanced many prompt engineering techniques, including Chain of Thought reasoning methods.
Google develops Gemini, a family of multimodal AI models. The latest flagship, Gemini 3, supports text, image, video, and audio through the Gemini API and Google AI Studio. Google also offers specialized models including Imagen for image generation, Veo for video, and Lyria 3 for music. Their research has advanced many prompt engineering techniques, including Chain of Thought reasoning.

Visit the following resources to learn more:

- [@official@Google AI Studio](https://ai.google.dev/)
- [@official@Gemini API Documentation](https://ai.google.dev/gemini-api/docs)
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# Hallucination

Hallucination in LLMs refers to generating plausible-sounding but factually incorrect or fabricated information. This occurs when models fill knowledge gaps or present uncertain information with apparent certainty. Mitigation techniques include requesting sources, asking for confidence levels, providing context, and always verifying critical information independently.
Hallucination in LLMs refers to generating plausible-sounding but factually incorrect or fabricated information. This occurs when models fill knowledge gaps or present uncertain information with apparent certainty. Mitigation techniques include requesting sources, asking for confidence levels, providing context, and always verifying critical information independently.

Visit the following resources to learn more:

- [@official@Reduce hallucinations - Anthropic](https://platform.claude.com/docs/en/test-and-evaluate/strengthen-guardrails/reduce-hallucinations)
- [@article@What are AI hallucinations? - IBM](https://www.ibm.com/think/topics/ai-hallucinations)
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
# Introduction

Prompt engineering is the practice of designing effective inputs for Large Language Models to achieve desired outputs. This roadmap covers fundamental concepts, core techniques, model parameters, and advanced methods. It's a universal skill accessible to anyone, requiring no programming background, yet crucial for unlocking AI potential across diverse applications and domains.
Prompt engineering is the practice of designing effective inputs for Large Language Models to achieve desired outputs. This roadmap covers fundamental concepts, core techniques, model parameters, and advanced methods. It's a universal skill accessible to anyone, requiring no programming background, yet crucial for unlocking AI potential across diverse applications and domains.

Visit the following resources to learn more:

- [@article@What is Generative AI? - LearnPrompting](https://learnprompting.org/docs/basics/generative_ai)
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,4 @@ LLM self-evaluation involves prompting models to assess their own outputs for qu

Visit the following resources to learn more:

- [@article@LLM Self-Evaluation](https://learnprompting.org/docs/reliability/lm_self_eval)
- [@article@LLM Self-Evaluation - LearnPrompting](https://learnprompting.org/docs/reliability/lm_self_eval)
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# LLM

Large Language Models (LLMs) are AI systems trained on vast text data to understand and generate human-like language. They work as prediction engines, analyzing input and predicting the next most likely token. LLMs perform tasks like text generation, translation, summarization, and Q&A. Understanding token processing is key to effective prompt engineering.
Large Language Models (LLMs) are AI systems trained on vast text data to understand and generate human-like language. They work as prediction engines, analyzing input and predicting the next most likely token. LLMs perform tasks like text generation, translation, summarization, and Q&A. Understanding token processing is key to effective prompt engineering.

Visit the following resources to learn more:

- [@official@LLM - Anthropic Glossary](https://platform.claude.com/docs/en/about-claude/glossary)
- [@article@Differences Between Chatbots and LLMs - LearnPrompting](https://learnprompting.org/docs/basics/chatbot_basics)
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,7 @@ LLMs function as sophisticated prediction engines that process text sequentially

Visit the following resources to learn more:

- [@article@What are large language models (LLMs)? - IBM](https://www.ibm.com/think/topics/large-language-models)
- [@article@Large language model - Wikipedia](https://en.wikipedia.org/wiki/Large_language_model)
- [@article@How Large Language Models Work: Explained Simply](https://justainews.com/applications/chatbots-and-virtual-assistants/how-large-language-models-work/)
- [@video@How Large Language Models Work](https://youtu.be/5sLYAQS9sWQ)
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# Max Tokens

Max tokens setting controls the maximum number of tokens an LLM can generate in response, directly impacting computation cost, response time, and energy consumption. Setting lower limits doesn't make models more concise—it simply stops generation when the limit is reached. This parameter is crucial for techniques like ReAct where models might generate unnecessary tokens after the desired response. Balancing max tokens involves considering cost efficiency, response completeness, and application requirements while ensuring critical information isn't truncated.
Max tokens setting controls the maximum number of tokens an LLM can generate in response, directly impacting computation cost, response time, and energy consumption. Setting lower limits doesn't make models more concise—it simply stops generation when the limit is reached. This parameter is crucial for techniques like ReAct where models might generate unnecessary tokens after the desired response. Balancing max tokens involves considering cost efficiency, response completeness, and application requirements while ensuring critical information isn't truncated.

Visit the following resources to learn more:

- [@official@Token Counting - Anthropic](https://platform.claude.com/docs/en/build-with-claude/token-counting)
- [@article@Max Tokens - LLM Parameter Guide - Vellum](https://www.vellum.ai/llm-parameters/max-tokens)
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# Meta

Meta (formerly Facebook) develops the Llama family of open-source large language models. Llama models are available for research and commercial use, offering strong performance across various tasks. For prompt engineering, Meta's models provide transparency in training data and architecture, allowing developers to fine-tune and customize prompts for specific applications without vendor lock-in.
Meta develops the Llama family of open-source large language models. The latest release, Llama 4, comes in Maverick and Scout variants with strong multimodal and long-context capabilities. Llama models are freely available for research and commercial use, providing transparency in training data and architecture without vendor lock-in.

Visit the following resources to learn more:

- [@official@Llama](https://www.llama.com/)
- [@opensource@Llama Models (GitHub)](https://github.com/meta-llama/llama-models)
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
# Model Weights / Parameters

Model weights and parameters are the learned values that define an LLM's behavior and knowledge. Parameters are the trainable variables adjusted during training, while weights represent their final values. Understanding parameter count helps gauge model capabilities - larger models typically have more parameters and better performance but require more computational resources.
Model weights and parameters are the learned values that define an LLM's behavior and knowledge. Parameters are the trainable variables adjusted during training, while weights represent their final values. Understanding parameter count helps gauge model capabilities - larger models typically have more parameters and better performance but require more computational resources.

Visit the following resources to learn more:

- [@article@What are LLM parameters? - IBM](https://www.ibm.com/think/topics/llm-parameters)
Loading