May 7, 2026
Preparing for the AWS AI Practitioner Certification
A condensed, exam-shaped study guide — concepts, AWS services, and the question patterns that actually show up.
These are my consolidated notes for AWS Certified AI Practitioner (AIF-C01). The exam is foundational — broad rather than deep — so the trick is recognising the pattern in each question, not memorising algorithm internals. I’ve organised this around how the questions actually read.
Exam logistics
- 65 questions, 90 minutes. Only 50 are scored — the other 15 are unscored research items, indistinguishable from real ones. Don’t skip anything.
- Mix of multiple-choice and multi-response.
- Results are reported on a scaled range of 100–1,000, with a minimum passing score of 700. Aim for 75%+ on practice tests to feel safe.
Part 1 — Machine learning foundations
The vocabulary that trips people up
| Term | Meaning |
|---|---|
| Feature | An input attribute. (Square footage of a house.) |
| Label | The target output. (House price.) |
| Weights | How important each feature is to the model’s prediction. |
| Parameters | Internal variables learned during training. |
Parameters = weights + biases + others. All weights are parameters; not all parameters are weights. Expect a question on this.
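To make that distinction concrete, here is a minimal scikit-learn sketch (not exam material; the numbers are invented):

```python
# Weights vs. parameters in a linear model (scikit-learn).
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1200.0], [1500.0], [2000.0]])     # feature: square footage
y = np.array([240_000.0, 300_000.0, 400_000.0])  # label: house price

model = LinearRegression().fit(X, y)

print("weight (coef_):   ", model.coef_)       # a weight, and therefore a parameter
print("bias (intercept_):", model.intercept_)  # a parameter, but not a weight
```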
When ML is the right tool — and when it isn’t
Good fits: recommendations, demand forecasting, image/video recognition, sentiment analysis, translation, personalization, fraud detection.
Bad fits — pick these in “when should we not use ML?” questions:
- Rapidly changing environments (model goes stale faster than you can retrain)
- Safety-critical systems demanding 100% accuracy
- Hard regulatory constraints requiring full explainability
- Tiny datasets
- Domains where bias amplification is unacceptable
The three learning paradigms
| Paradigm | Data | What it does | Examples |
|---|---|---|---|
| Supervised | Labeled | Predict | Spam classification, house-price regression |
| Unsupervised | Unlabeled | Discover patterns | Customer clustering, anomaly detection |
| Reinforcement | Reward signals | Learn from interaction | Game-playing, AWS DeepRacer |
Supervised algorithms worth knowing
| Algorithm | Task | Typical use |
|---|---|---|
| Linear regression | Regression | Sales / price forecasting |
| Logistic regression | Classification | Spam detection |
| Decision tree | Both | Credit approval |
| Random forest | Both | Fraud detection |
| Support Vector Machine (SVM) | Both | Image classification |
| K-Nearest Neighbours (KNN) | Both | Distance-based recommendations |
| Naive Bayes | Classification | Text classification |
| Gradient boosting | Both | Tabular Kaggle wins |
| Neural networks / deep learning | Both | Most modern systems |
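To ground one row of the table, here is a hedged scikit-learn sketch of logistic regression for spam classification; the four-document corpus is invented purely for illustration:

```python
# Logistic regression spam classifier on a tiny invented corpus (scikit-learn).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["win free money now", "meeting at 3pm",
         "free prize claim now", "lunch tomorrow?"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vec = CountVectorizer()
X = vec.fit_transform(texts)  # bag-of-words features

clf = LogisticRegression().fit(X, labels)
print(clf.predict(vec.transform(["claim your free prize"])))  # expect [1]: spam
```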
Unsupervised algorithms
- K-means clustering — partitions into a predefined number of groups based on similarity.
- Hierarchical clustering — builds a tree of nested clusters.
- PCA — dimensionality reduction.
- Autoencoders — neural networks that compress input and then reconstruct it; useful for anomaly detection too.
- Association rules — “people who bought X also bought Y.”
- Isolation Forest / Random Cut Forest (RCF) — anomaly detection. RCF is AWS’s built-in algorithm.
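A quick K-means sketch in scikit-learn, with invented (annual spend, monthly visits) features, showing what "a predefined number of groups" means in practice:

```python
# K-means customer clustering on invented data (scikit-learn).
import numpy as np
from sklearn.cluster import KMeans

customers = np.array([
    [200, 1], [250, 2],      # low spend, infrequent visits
    [5000, 20], [5200, 22],  # high spend, frequent visits
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(km.labels_)  # cluster assignment for each customer, e.g. [0 0 1 1]
```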
The ML lifecycle
Business problem
→ Problem formulation
→ Data collection & integration
→ Preprocessing & visualization
→ Model training
→ Evaluation
→ Tuning & feature engineering
→ Deployment
→ Monitoring → (loop back to data)
Three stages that look similar but aren’t (see the pandas sketch after this list):
- Data collection — ingest, aggregate, label. Adds rows.
- Feature engineering — create / transform / extract / select features. Adds variables (columns).
- Hyperparameter tuning — adjust the algorithm’s behaviour. Adds neither.
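Here is the rows-vs-columns distinction as a tiny pandas sketch (the data is invented):

```python
# Rows vs. columns: data collection adds rows, feature engineering adds columns.
import pandas as pd

df = pd.DataFrame({"sqft": [1200, 2000], "lot_sqft": [3000, 5000]})

# Feature engineering: derive a new variable, which adds a COLUMN.
df["sqft_ratio"] = df["sqft"] / df["lot_sqft"]

# Data collection: ingest more records, which adds ROWS.
# (The new row's sqft_ratio is NaN until it gets engineered too.)
new_rows = pd.DataFrame({"sqft": [1500], "lot_sqft": [4000]})
df = pd.concat([df, new_rows], ignore_index=True)

print(df)
```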
Inference modes (memorise this — easy points)
| Mode | Use when |
|---|---|
| Real-time | Low-latency interactive predictions; persistent endpoint |
| Serverless | Intermittent traffic; you don’t want idle cost |
| Asynchronous | Payloads up to 1 GB, processing up to 1 hour, queue-based |
| Batch transform | Large offline jobs, no endpoint needed |
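A hedged boto3 sketch contrasting the real-time and asynchronous modes; the endpoint names and S3 URI are placeholders, and both endpoints must already exist:

```python
# Real-time vs. asynchronous SageMaker inference with boto3.
import boto3

smr = boto3.client("sagemaker-runtime")

# Real-time: synchronous call to a persistent endpoint, low latency, small payload.
resp = smr.invoke_endpoint(
    EndpointName="my-realtime-endpoint",  # placeholder
    ContentType="application/json",
    Body=b'{"features": [1200]}',
)
print(resp["Body"].read())

# Asynchronous: input is staged in S3, request is queued (up to 1 GB / 1 hour).
resp = smr.invoke_endpoint_async(
    EndpointName="my-async-endpoint",          # placeholder
    InputLocation="s3://my-bucket/input.json"  # placeholder S3 URI
)
print(resp["OutputLocation"])  # S3 location where the result will appear
```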
Part 2 — Foundation models and Generative AI
The hierarchy
AI ⊃ Machine Learning ⊃ Deep Learning ⊃ Generative AI
Foundation Models (FMs)
- Pre-trained on massive unlabeled datasets. (Classic supervised ML needs labeled data; this is the key difference.)
- Billions of parameters.
- Trained via self-supervised learning — the model creates its own labels from the data itself (e.g., “predict the next word”, “predict the masked word”). No human annotation.
FM lifecycle
Data Selection → Pre-training → Optimization → Evaluation
→ Deployment → Feedback & continuous improvement → (loop)
LLMs — a special case of FM
- Predict the probability of the next token.
- Built on the transformer architecture:
- Self-attention lets the model weigh every token against every other token.
- Position-aware (positional encodings) so word order matters.
- Parallelizable on GPUs (key reason transformers won over RNNs).
Input → Tokenization → Embeddings (+ positional encodings) → Transformer layers → Decoding → Output
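To see what the tokenization step actually produces, here is a sketch using the Hugging Face transformers library (not an AWS service, and not exam-required):

```python
# Tokenization: text becomes subword tokens, then integer IDs (Hugging Face).
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

ids = tok.encode("Transformers weigh every token")
print(ids)                             # the integer IDs the model actually sees
print(tok.convert_ids_to_tokens(ids))  # the subword pieces behind those IDs
```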
Limitations to remember
- Hallucinations — confidently wrong output
- Inaccuracy / outdated knowledge
- Non-determinism (same prompt, different outputs)
- Verbosity (“chatty”)
- Poor interpretability (you can’t easily explain why it answered)
Part 3 — Customizing foundation models
The AWS exam loves the effort/cost spectrum:
Prompt Engineering → RAG → Fine-tuning → Continued Pre-training → Train from scratch
← cheaper, faster · · · more expensive, slower →
Pick the leftmost option that solves the problem.
Prompt engineering techniques
| Technique | What it is |
|---|---|
| Zero-shot | Just ask the question; no examples |
| One-shot | Show one example of the task |
| Few-shot | Show 2+ examples |
| Chain-of-thought | “Think step by step” — surfaces reasoning |
| Negative prompting | Explicitly say what not to include |
| ReAct | Chain-of-thought + tool/API calls (e.g., REST integration) |
Prompt templates are reusable formats with placeholders for variable input. They give you consistency, fewer errors, and easy iteration, and they show up in workflow questions.
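A minimal template sketch in plain Python; the placeholder names are invented:

```python
# A prompt template: fixed instructions plus placeholders for variable input.
TEMPLATE = """You are a support assistant.

Context: {context}
Question: {question}

Answer in two sentences or fewer."""

prompt = TEMPLATE.format(
    context="Refunds are processed within 5 business days.",
    question="How long do refunds take?",
)
print(prompt)  # same structure every time; only the variables change
```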
RAG (Retrieval-Augmented Generation)
Inject external knowledge at query time. Pick RAG when:
- The information isn’t in the training data (your latest internal docs, a fresh policy)
- You want to cite sources
- You want to swap knowledge without retraining
AWS vector stores you should recognize:
- Amazon OpenSearch / OpenSearch Serverless
- Amazon RDS for PostgreSQL (with pgvector)
- Amazon Neptune
- Amazon DocumentDB
- Amazon Kendra (managed intelligent search — also a valid RAG knowledge source)
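On AWS, Knowledge Bases for Amazon Bedrock wires the retrieve-then-generate loop together for you. A hedged boto3 sketch, where the knowledge base ID and model ARN are placeholders:

```python
# Querying a Bedrock Knowledge Base (managed RAG) with boto3.
import boto3

client = boto3.client("bedrock-agent-runtime")

resp = client.retrieve_and_generate(
    input={"text": "What is our latest refund policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB1234567890",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "anthropic.claude-3-haiku-20240307-v1:0",  # placeholder
        },
    },
)
print(resp["output"]["text"])  # grounded answer
print(resp["citations"])       # retrieved sources, one reason to pick RAG
```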
Fine-tuning vs. continued pre-training
| | Fine-tuning | Continued pre-training |
|---|---|---|
| Data | Labeled, domain-specific | Unlabeled, domain-specific |
| Goal | Improve a specific task | Adapt the model to a new domain |
| Modifies weights? | Yes | Yes |
If a question mentions labeled data → fine-tuning. Unlabeled → continued pre-training.
Inference-time parameters
- Temperature (0–1) — controls randomness. Closer to 0 = deterministic. Set to 1 = maximum creativity (and maximum hallucination risk).
- Top-K — sample from the top K most likely tokens.
- Top-P (nucleus sampling) — sample from the smallest set of tokens whose cumulative probability ≥ P.
Original tokens → Temperature → Top-K → Top-P → Random selection
“Reduce randomness in the LLM’s output” → lower the temperature.
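These parameters map directly onto API fields. A hedged sketch using the Bedrock Converse API via boto3; the model ID is a placeholder, and because Top-K is model-specific it travels in additionalModelRequestFields rather than inferenceConfig:

```python
# Setting temperature / Top-P / Top-K through the Bedrock Converse API.
import boto3

bedrock = boto3.client("bedrock-runtime")

resp = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
    messages=[{"role": "user", "content": [{"text": "Name one AWS region."}]}],
    inferenceConfig={"temperature": 0.1,  # low = more deterministic
                     "topP": 0.9,
                     "maxTokens": 100},
    additionalModelRequestFields={"top_k": 50},  # Top-K, model-specific field
)
print(resp["output"]["message"]["content"][0]["text"])
```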
Guardrails
- Bedrock Guardrails filter inputs/outputs for PII, toxic content, and prohibited topics.
- AWS Audit Manager has a prebuilt framework specifically for auditing GenAI applications built on Amazon Bedrock. It shows up in compliance questions.
Part 4 — Evaluation metrics
Classification
| Metric | Use |
|---|---|
| Accuracy | Overall % correct (misleading on imbalanced data) |
| Precision | Of predicted positives, how many were truly positive |
| Recall | Of actual positives, how many we caught |
| F1 | Harmonic mean of precision and recall — best for imbalanced datasets |
| AUC-ROC | Binary classification; especially imbalanced scenarios like fraud detection |
| Confusion matrix | Visualizes TP / TN / FP / FN |
F1 is the go-to answer for imbalanced binary classification unless the question stresses ranking / threshold tradeoffs — then it’s AUC-ROC.
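A toy imbalanced example in scikit-learn showing why accuracy misleads while precision, recall, and F1 tell the truth:

```python
# Accuracy vs. precision/recall/F1 on an imbalanced toy example (scikit-learn).
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # only 2 positives out of 10
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]  # one FP, one TP, one FN

print("accuracy: ", accuracy_score(y_true, y_pred))   # 0.8, looks fine, misleads
print("precision:", precision_score(y_true, y_pred))  # 0.5 = TP / (TP + FP)
print("recall:   ", recall_score(y_true, y_pred))     # 0.5 = TP / (TP + FN)
print("f1:       ", f1_score(y_true, y_pred))         # 0.5, harmonic mean
```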
Regression
- MAE, MSE, RMSE, MAPE.
- MSE is regression-only. If a question offers MSE for a classification problem, it’s a distractor.
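All four regression metrics are one-liners in NumPy; the values below are invented:

```python
# MAE / MSE / RMSE / MAPE computed by hand (NumPy).
import numpy as np

y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 330.0])

err = y_pred - y_true
print("MAE: ", np.mean(np.abs(err)))                 # 16.67, average absolute miss
print("MSE: ", np.mean(err ** 2))                    # 366.67, punishes big misses
print("RMSE:", np.sqrt(np.mean(err ** 2)))           # 19.15, back in y's units
print("MAPE:", np.mean(np.abs(err / y_true)) * 100)  # 8.33, in percent terms
```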
Text generation
| Metric | Use |
|---|---|
| BLEU | Machine translation (compares n-grams to reference) |
| ROUGE | Summarization |
| BERTScore | Semantic similarity using contextual embeddings (e.g., comparing a chatbot’s response to an expert answer) |
F1 does not evaluate text generation. Don’t pick it for a generation task.
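For intuition, a sentence-level BLEU sketch with NLTK; real evaluations use corpus-level BLEU and multiple references:

```python
# Sentence-level BLEU: n-gram overlap between candidate and reference (NLTK).
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["the", "cat", "sat", "on", "the", "mat"]
candidate = ["the", "cat", "is", "on", "the", "mat"]

score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(round(score, 3))  # higher = closer n-gram overlap with the reference
```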
Bias and variance
- High bias → underfitting (poor on both train and test).
- High variance → overfitting (great on train, bad on test).
If accuracy is high on training data but low on test data, it’s overfitting.
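You can reproduce the pattern with an unconstrained decision tree in scikit-learn (synthetic data; exact numbers will vary):

```python
# Overfitting: a deep tree memorizes training data and generalizes worse.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier().fit(X_tr, y_tr)   # no depth limit
print("train accuracy:", tree.score(X_tr, y_tr))  # ~1.0
print("test accuracy: ", tree.score(X_te, y_te))  # noticeably lower: overfitting
```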
Part 5 — AWS services cheat sheet
SageMaker family
| Service | What it does |
|---|---|
| SageMaker Canvas | No-code ML predictions |
| SageMaker Autopilot | AutoML — automated build & deploy |
| SageMaker Data Wrangler | Import, prepare, transform, featurize |
| SageMaker Feature Store | Centralized feature repository for training & inference |
| SageMaker JumpStart | Pre-trained open-source models (great for summarization questions) |
| SageMaker Model Cards | Document key model details for governance |
| SageMaker Model Monitor | Drift detection in production |
| SageMaker Clarify | Bias reports + bias-drift monitoring |
| SageMaker inference endpoints | Hosted prediction endpoints; AWS manages infra |
Model Monitor + Clarify together watch four dimensions: data quality, model quality, bias drift, feature attribution drift.
Higher-level AI services (no ML expertise required)
| Service | Use it for |
|---|---|
| Amazon Comprehend | NLP — sentiment, entities, key phrases (e.g., analyzing customer reviews) |
| Amazon Rekognition | Image and video analysis |
| Amazon Textract | Extract text and structured data from documents, PDFs, images (invoices, receipts) |
| Amazon Personalize | Recommendations and user-segment targeting (e.g., marketing campaigns) |
| Amazon Q Developer | Coding assistant — chat about code, completions, security scans, language upgrades |
| Amazon Augmented AI (A2I) | Human-in-the-loop review of ML predictions for high precision |
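Calling these services really is a one-liner with boto3. A hedged Amazon Comprehend sketch (needs AWS credentials; the review text is invented):

```python
# Sentiment analysis of a customer review with Amazon Comprehend (boto3).
import boto3

comprehend = boto3.client("comprehend")

resp = comprehend.detect_sentiment(
    Text="The delivery was late and the box was damaged.",
    LanguageCode="en",
)
print(resp["Sentiment"])       # e.g. NEGATIVE
print(resp["SentimentScore"])  # per-class confidence scores
```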
Security, governance, compliance
| Service | Purpose |
|---|---|
| AWS Artifact | Self-service compliance reports (ISO, SOC, PCI), HIPAA BAAs |
| AWS Audit Manager | Continuous compliance auditing — including a prebuilt GenAI framework |
| AWS Config | Resource configuration history & compliance |
| AWS Trusted Advisor | Best-practice recommendations across cost, security, performance, resilience |
| Amazon GuardDuty | ML-based threat detection across AWS accounts |
| Amazon Macie | Discover sensitive data inside S3 buckets |
| Amazon Inspector | Vulnerability scanning for workloads |
Data lineage = tracking the flow and transformation of data, for privacy/compliance. Recognize the term.
Part 6 — Question patterns that show up
If you see these phrases, the answer is almost always:
| The question says… | The answer is… |
|---|---|
| “Generates new data resembling existing data” | GAN (Generative Adversarial Network) |
| “AI plays a complex strategy game” | Reinforcement learning |
| “Adapt a pre-trained model to a new task” | Transfer learning |
| “Generate images from text descriptions” | Diffusion models / Stable Diffusion |
| “High train accuracy, low test accuracy” | Overfitting |
| “Reduce randomness in LLM output” | Lower the temperature |
| “Payloads up to 1 GB / processing up to 1 hour” | Asynchronous inference |
| “Find sensitive data in S3” | Macie |
| “Threat detection across AWS” | GuardDuty |
| “Analyze customer review sentiment” | Comprehend |
| “Extract data from invoices / receipts” | Textract |
| “Audit a GenAI app for compliance” | Audit Manager (prebuilt framework) |
| “Recommend a customer segment for a campaign” | Personalize |
| “Document model details for governance” | SageMaker Model Cards |
| “Imbalanced binary classification metric” | F1 or AUC-ROC |
| “Bias report on the model” | SageMaker Clarify |
| “No-code ML for business users” | SageMaker Canvas |
| “AutoML — build & deploy automatically” | SageMaker Autopilot |
| “Pre-trained model for summarization” | SageMaker JumpStart |
Final prep tips
- Read AWS’s official exam guide once more in the last week. It’s short and tells you exactly what they care about.
- Memorize the SageMaker family. It’s worth roughly 8–10 questions.
- Know inference modes by their constraints — payload size, latency, async vs. sync. Easy points.
- Watch for “does NOT do X” wording. The trap is usually a service that almost fits but is missing a feature (e.g., Rekognition can’t summarize text; Personalize can’t generate images).
- Hit 80%+ on a full-length practice exam before booking the real one.
Good luck. The exam rewards pattern recognition over depth — if you’ve internalised the cheat sheet above, you’ll have time to spare.