AI/ML Penetration Testing
Master offensive security testing of AI and machine learning systems including large language models, classifiers, recommendation engines, RAG pipelines, and autonomous agents. Learn prompt injection, jailbreaks, adversarial examples, model extraction, training-data poisoning, supply-chain attacks, and governance assessment aligned with OWASP LLM Top 10, OWASP ML Top 10, NIST AI RMF, and MITRE ATLAS.
Duration
3 Months / 12 Weeks / 90 Hours
Level
Advanced
Modules
13 Modules
Format
Hands-on Labs
What You'll Learn
The AI/ML Penetration Testing course is designed for penetration testers, red teamers, application security engineers, ML engineers, and AI security researchers who need to evaluate the security posture of modern AI systems. The curriculum covers the full attack surface of LLM-powered applications, traditional ML pipelines, and agentic systems. You will perform prompt injection and jailbreak attacks, build adversarial example payloads with FGSM, PGD and Carlini-Wagner, execute training-data poisoning and backdoor insertion, attack RAG pipelines and embedding stores, abuse Model Context Protocol (MCP) servers and tool-calling agents, conduct model extraction and inversion attacks, and assess ML supply chains and governance controls against NIST AI RMF and MITRE ATLAS.
// Prerequisites
- - Solid penetration testing or application security foundation
- - Working Python proficiency for scripting and tooling
- - Familiarity with REST APIs, HTTP, and Burp Suite
- - Basic understanding of machine learning concepts (training, inference, loss, gradients)
- - Comfort with Linux command line and Git
- - Laptop with 16GB+ RAM (GPU helpful but not required)
$ armour --training ai-ml-pt --info
[*] Course: AI/ML Penetration Testing
[*] Duration: 3 Months / 12 Weeks / 90 Hours
[*] Level: Advanced
[!] 13 modules | 97 topics
[+] Lab environment: READY
[+] Certification prep: INCLUDED
$ _
Complete Course Modules
AI/ML Security Landscape & Threat Modeling
- > Modern AI/ML deployment patterns
- > LLM applications, RAG, and agentic systems
- > Traditional ML vs generative AI threat models
- > STRIDE applied to ML systems
- > OWASP LLM Top 10 walkthrough
- > OWASP Machine Learning Top 10 walkthrough
- > Mapping attacker goals to ML kill chain
MITRE ATLAS & NIST AI RMF
- > MITRE ATLAS tactics and techniques
- > ATLAS Navigator for engagement planning
- > NIST AI Risk Management Framework (AI RMF) overview
- > Govern, Map, Measure, Manage functions
- > Mapping findings to ATLAS and AI RMF
- > Engagement scoping for AI red teaming
- > Rules of engagement for production LLMs
LLM Prompt Injection
- > Direct prompt injection fundamentals
- > Indirect prompt injection via documents, URLs, and tools
- > System prompt extraction
- > Instruction hierarchy bypass
- > Multilingual and obfuscated payloads
- > Token-level smuggling and Unicode tricks
- > Cross-context injection in chat histories
- > Building a prompt injection harness in Python
Jailbreaks & Guardrail Bypass
- > Role-play and persona-based jailbreaks
- > Many-shot and crescendo jailbreaks
- > Encoding, translation, and cipher-based bypasses
- > Refusal classifier evasion
- > Bypassing content filters and moderation APIs
- > Automated jailbreak discovery with Garak and PyRIT
- > Tracking guardrail regressions across model versions
RAG Pipeline Attacks
- > Anatomy of a RAG system
- > Indirect injection through ingested documents
- > Embedding inversion and recovery of source text
- > Vector store poisoning
- > Retrieval ranking manipulation
- > Cross-tenant data leakage in multi-tenant RAG
- > Metadata and citation forgery
- > Hardening retrieval pipelines
MCP & Agentic System Abuse
- > Model Context Protocol (MCP) architecture
- > Malicious MCP server design
- > Tool poisoning and tool-description injection
- > Confused-deputy attacks on autonomous agents
- > Function-calling abuse and argument smuggling
- > Privilege escalation through chained tools
- > Sandbox escapes in code-execution tools
- > Auditing agent action logs
Adversarial Examples for ML Classifiers
- > Adversarial example theory and threat models
- > Fast Gradient Sign Method (FGSM)
- > Projected Gradient Descent (PGD)
- > Carlini-Wagner (CW) attacks
- > Transferability across models
- > Black-box query-based attacks
- > Physical-world adversarial patches
- > Using Adversarial Robustness Toolbox (ART)
Training-Data Poisoning & Backdoors
- > Clean-label vs dirty-label poisoning
- > Backdoor trigger design
- > BadNets and label-flipping attacks
- > Poisoning fine-tuning datasets
- > Poisoning RLHF preference data
- > Detection via spectral signatures and activation clustering
- > Supply-chain implications of public datasets
Model Extraction & Model Inversion
- > Query-based model extraction
- > Functional equivalence vs fidelity extraction
- > Extracting decision boundaries
- > Model inversion to reconstruct training data
- > Membership inference attacks
- > Attribute inference attacks
- > Privacy budget and differential privacy basics
- > Defensive rate limiting and watermarking
ML Supply Chain Attacks
- > Hugging Face Hub threat surface
- > Malicious pickle and safetensors payloads
- > Compromised model cards and configs
- > Dependency confusion in ML tooling
- > Notebook and pipeline poisoning
- > Container and base-image risks for ML workloads
- > Signing, attestation, and SBOMs for models
LLM Application Security Testing
- > Testing LLM-backed web APIs with Burp Suite
- > Authentication and authorization in LLM apps
- > Output handling: XSS, SSRF, and command injection via LLM responses
- > Insecure plugin design (OWASP LLM07)
- > Excessive agency (OWASP LLM08)
- > Sensitive information disclosure
- > Denial-of-wallet and cost-exhaustion attacks
Tooling Deep Dive
- > Garak vulnerability scanner for LLMs
- > Microsoft PyRIT for automated red teaming
- > Adversarial Robustness Toolbox (ART)
- > Counterfit for ML attack automation
- > MITRE ATLAS Navigator workflows
- > Building custom prompt-injection harnesses in Python
- > HuggingFace transformers for offline attack research
- > Integrating tools with Burp and CI pipelines
Reporting, Governance & AI Red Team Operations
- > Structuring AI red team findings
- > Severity scoring for AI/ML vulnerabilities
- > Mapping findings to OWASP, ATLAS, and NIST AI RMF
- > Governance review: model cards, data sheets, evaluation suites
- > Continuous AI red teaming programs
- > Communicating risk to ML and product teams
- > Building safe disclosure and remediation workflows
Learning Outcomes
- Scope and execute end-to-end penetration tests of LLM-powered applications
- Perform direct and indirect prompt injection against production-style systems
- Build adversarial examples using FGSM, PGD, and Carlini-Wagner attacks
- Conduct training-data poisoning and backdoor insertion exercises
- Attack RAG pipelines, vector stores, and embedding-based retrieval
- Abuse MCP servers and tool-calling agents through confused-deputy and tool poisoning
- Execute model extraction, inversion, and membership inference attacks
- Assess ML supply chains for malicious models, pickles, and dependency risks
- Use Garak, PyRIT, ART, Counterfit, and custom harnesses effectively
- Map findings to OWASP LLM Top 10, OWASP ML Top 10, MITRE ATLAS, and NIST AI RMF
- Produce professional AI red team reports with prioritized remediation
Hands-On Labs
- Self-hosted vulnerable LLM chatbot with system prompts and tools
- RAG pipeline lab with poisonable document store and vector DB
- Agentic system lab built on Model Context Protocol (MCP)
- Image and tabular classifier lab for adversarial example exercises
- Training-data poisoning sandbox with reproducible fine-tuning jobs
- Garak and PyRIT preconfigured against local and API targets
- Adversarial Robustness Toolbox (ART) and Counterfit ready to run
- Burp Suite with custom extensions for LLM API testing
- Python + HuggingFace transformers environment with sample models
- MITRE ATLAS Navigator workspace for engagement tracking
Certification Preparation
- +OSCP+ (Offensive Security Certified Professional+)Advanced Offensive Security Certification
- +CEH (Certified Ethical Hacker)
- +OWASP LLM Top 10 alignment
- +OWASP Machine Learning Top 10 alignment
- +MITRE ATLAS-aligned AI red team methodology
- +NIST AI Risk Management Framework readiness
- +Generic AI red-teaming and AI security assessor preparation
Training Mode
One unified programme delivered in two parallel modes — same curriculum, same trainers, same certification.
- Online Live Classes (Instructor-led)
- On-Premise Classroom Training (Indore Centre)
- Both modes run concurrently
- Students can choose either mode
- Same curriculum for both formats
- Same certification for both tracks
Meet Your Instructor
Armour Infosec AI Security Team
Lead AI Red Team Instructor
Our AI security instructors are active red teamers who assess LLM-powered products, agentic systems, and ML pipelines for enterprise clients. They combine traditional offensive security backgrounds with hands-on machine learning experience and bring real engagement patterns, custom prompt-injection harnesses, and tool chains including Garak, PyRIT, ART, Counterfit, Burp Suite, and HuggingFace transformers directly into the lab.
What Students Say
“The RAG and MCP labs are unlike anything I have seen elsewhere. I walked into my first AI red team engagement with a methodology I actually trusted.”
Aditya R.
Application Security Engineer
“As an ML engineer pivoting into security, the adversarial examples and poisoning modules connected the math I knew to real attacker behaviour. Garak and PyRIT now feel routine.”
Sneha P.
ML Engineer
“OWASP LLM Top 10, ATLAS, and NIST AI RMF were abstract for me until this course. The instructors map every finding back to a framework so reports basically write themselves.”
Karthik V.
Penetration Tester
Frequently Asked Questions
Common questions about the course, enrollment, and certification.
Ready to Enroll?
Secure your spot in the next batch. Limited seats available for hands-on lab access.