Best Practices
Tips for designing effective scenarios and troubleshooting common issues.
Scenario Design
Start with the End in Mind
Before building, define:
- Learning objective: What should participants be able to do afterward?
- Prerequisites: What should they already know?
- Time estimate: How long should it take?
Keep It Focused
| Do | Don't |
|---|---|
| One clear objective per scenario | Multiple unrelated topics |
| 3-5 components max for beginners | Overwhelm with infrastructure |
| 15-30 minute completion time | Multi-hour marathons |
Progressive Complexity
Structure scenarios from simple to complex:
Beginner: 1 component, basic operations
↓
Intermediate: 2-3 components, interactions
↓
Advanced: Full stack, complex workflows
Real-World Relevance
Use realistic scenarios:
- Actual use cases from production
- Industry-standard configurations
- Meaningful sample data
Instruction Writing
The WHAT-WHY-HOW Pattern
For each instruction:
markdown
## Create a Kafka Topic
**What:** Create a topic called `orders` with 3 partitions.
**Why:** Partitions allow parallel processing. Three partitions
let us scale to three consumers.
**How:** Open the [Kafka terminal](tab:kafka) and run:
\`\`\`bash
kafka-topics --create --topic orders \
--bootstrap-server localhost:9092 \
--partitions 3
\`\`\`
**Verify:** List topics to confirm creation:
\`\`\`bash
kafka-topics --list --bootstrap-server localhost:9092
\`\`\`
You should see `orders` in the output.
Show Expected Output
Always tell participants what success looks like:
markdown
You should see output similar to:
\`\`\`
Created topic orders.
\`\`\`
Include Checkpoints
Add verification steps throughout:
markdown
**Checkpoint:** Before continuing, verify:
- [ ] Topic `orders` exists
- [ ] You can produce a test message
- [ ] Consumer receives the message
Anticipate Mistakes
Add troubleshooting tips where errors are common:
markdown
::: tip Troubleshooting
If you see "Connection refused", wait 30 seconds for Kafka
to finish starting, then try again.
:::
Resource Management
Right-Size Components
Start minimal, increase if needed:
| Component | Start With | Increase If |
|---|---|---|
| PostgreSQL | 250m / 512Mi | Query timeouts |
| Kafka | 500m / 1Gi | Slow message processing |
| Elasticsearch | 500m / 1Gi | Search timeouts |
| Simple app | 100m / 128Mi | OOM kills |
Calculate Total Resources
Sum the requests of all components, then add 20% headroom (the example rounds the memory headroom up to 360Mi):
PostgreSQL: 250m / 512Mi
Kafka: 500m / 1Gi
API: 200m / 256Mi
─────────────────────────────
Subtotal: 950m / 1792Mi
+ 20%: 190m / 360Mi
─────────────────────────────
Request: 1140m / 2152Mi
Watch for Resource Contention
If scenarios fail under load:
- Check if pods are being OOM-killed
- Look for CPU throttling
- Consider reducing replica counts
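The headroom arithmetic from Calculate Total Resources can be scripted so the request figures are recomputed whenever a component changes. The following is a minimal sketch in POSIX shell, not platform tooling: `with_headroom` is a hypothetical helper name, and the component figures are the ones from the worked example. It rounds up to the next whole unit, so the memory result is 2151Mi rather than the 2152Mi in the worked example, where the 20% was rounded to 360Mi.

```shell
# with_headroom is a hypothetical helper (an assumption, not a platform command).
# It adds percentage headroom to a summed resource figure using integer
# arithmetic, rounded up. Units are whatever the caller passes: millicores or Mi.
with_headroom() {
  total=$1
  percent=${2:-20}   # default headroom: 20%
  echo $(( (total * (100 + percent) + 99) / 100 ))
}

cpu_total=$(( 250 + 500 + 200 ))     # PostgreSQL + Kafka + API, in millicores
mem_total=$(( 512 + 1024 + 256 ))    # same components, in Mi (1Gi = 1024Mi)

echo "CPU request: $(with_headroom "$cpu_total")m"      # CPU request: 1140m
echo "Memory request: $(with_headroom "$mem_total")Mi"  # Memory request: 2151Mi
```

Passing a second argument changes the headroom percentage, for example `with_headroom 950 50` for a scenario that is expected to run hot.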
Component Configuration
Use Clear Labels
Labels appear in the UI and in instructions:
| Good | Bad |
|---|---|
| postgres | db1 |
| kafka | component-2 |
| api-server | my-app |
Set Appropriate Start Order
Order 0: Databases (postgres, redis)
Order 1: Message brokers (kafka)
Order 2: Backend services (api)
Order 3: Frontend (web)
Configure Health Checks
Ensure components report ready status correctly:
- Helm charts usually include probes
- Custom apps need explicit configuration
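When an instruction has to wait for a component from a terminal, a bounded retry loop is one way to avoid hanging forever if the component never becomes ready. The sketch below is an illustration in POSIX shell under stated assumptions: `wait_for` is a hypothetical helper name, and the readiness command is whatever check suits the component.

```shell
# wait_for is a hypothetical helper (an assumption, not a platform command).
# It retries a readiness command once per second until it succeeds or the
# limit (in seconds) is reached. Returns 0 on success, 1 on timeout.
wait_for() {
  limit=$1
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$limit" ]; then
      echo "Timed out after ${limit}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$(( elapsed + 1 ))
  done
}

# Example usage with the readiness checks used elsewhere in this guide:
#   wait_for 60 pg_isready
#   wait_for 60 kafka-topics --list --bootstrap-server kafka:9092
```

Compared with an unbounded `until ...; do sleep 1; done` loop, the time limit gives participants an explicit error message to act on instead of a terminal that appears frozen.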
Testing
Test the Happy Path
Run through your scenario as a participant:
- Follow every instruction literally
- Copy-paste all commands
- Verify all expected outputs
Test Error Recovery
Intentionally break things:
- Run commands out of order
- Make typos in commands
- Skip steps and see what happens
Test Time Limits
Ensure the scenario fits the TTL:
- Time yourself completing the lab
- Add buffer for slower participants
- Set TTL to 1.5x your completion time
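As a quick worked example of the 1.5x rule, the calculation can be sketched in shell. `suggest_ttl` is a hypothetical helper name, and it assumes the completion time is given in whole minutes.

```shell
# suggest_ttl is a hypothetical helper (an assumption, not a platform command).
# It prints 1.5x the author's own completion time in minutes, rounded up.
suggest_ttl() {
  minutes=$1
  echo $(( (minutes * 3 + 1) / 2 ))
}

suggest_ttl 20   # prints 30
suggest_ttl 25   # prints 38
```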
Test with Fresh Eyes
Have someone unfamiliar with the topic:
- Follow your instructions
- Note where they get confused
- Refine based on feedback
Common Patterns
Database Initialization
markdown
Wait for PostgreSQL to be ready, then create the schema:
\`\`\`bash
# Wait for database
until pg_isready; do sleep 1; done
# Create tables
psql -U postgres -d mydb -f /scripts/schema.sql
\`\`\`
Service Dependencies
markdown
Before starting the API, verify Kafka is ready:
\`\`\`bash
kafka-topics --list --bootstrap-server kafka:9092
\`\`\`
If this command succeeds, Kafka is ready.
Data Flow Verification
markdown
Let's trace a message through the system:
1. **Produce** a message:
\`\`\`bash
echo "test" | kafka-console-producer --topic orders \
--bootstrap-server kafka:9092
\`\`\`
2. **Consume** to verify:
\`\`\`bash
kafka-console-consumer --topic orders \
--from-beginning --max-messages 1 \
--bootstrap-server kafka:9092
\`\`\`
You should see `test` in the output.
Troubleshooting Guide
Scenario Won't Start
| Symptom | Likely Cause | Solution |
|---|---|---|
| Stuck on "Provisioning" | Resource quota exceeded | Reduce component resources |
| Pod stays "Pending" | No available nodes | Check cluster capacity |
| Multiple pods failing | Shared dependency issue | Check start order |
Component Issues
| Symptom | Likely Cause | Solution |
|---|---|---|
| ImagePullBackOff | Image doesn't exist | Verify image name and tag |
| CrashLoopBackOff | Container keeps crashing | Check logs for errors |
| OOMKilled | Out of memory | Increase memory limit |
| Evicted | Node pressure | Reduce resource usage |
Connectivity Issues
| Symptom | Likely Cause | Solution |
|---|---|---|
| "Connection refused" | Service not ready | Wait, check readiness |
| "Name not resolved" | Wrong hostname | Use the component label as the hostname |
| "Timeout" | Firewall/network | Check service ports |
Instruction Issues
| Symptom | Likely Cause | Solution |
|---|---|---|
| Tab link doesn't work | Wrong label | Match component label exactly |
| Code not highlighted | Missing language | Add language to fence |
| Images not showing | Wrong path | Use /images/filename.png |
Studio Issues
| Symptom | Likely Cause | Solution |
|---|---|---|
| Canvas won't load | Large scenario | Reduce components |
| Auto-save failing | Network issue | Check connection |
| Changes not reflecting | Cache issue | Refresh browser |
Anti-Patterns to Avoid
Don't: Assume Prior Knowledge
markdown
❌ Bad: "Configure Kafka as usual"
✅ Good: "Set the following Kafka configuration..."
Don't: Skip Verification Steps
markdown
❌ Bad: "Create the topic and continue"
✅ Good: "Create the topic, then verify with..."
Don't: Use Hardcoded Values
markdown
❌ Bad: "Connect to 192.168.1.100:9092"
✅ Good: "Connect to kafka:9092"
Don't: Forget Cleanup Instructions
markdown
❌ Bad: (scenario ends abruptly)
✅ Good: "Your lab will automatically clean up in X minutes.
To stop early, click Stop Lab."
Don't: Over-Engineer
markdown
❌ Bad: 15 components for a basic demo
✅ Good: Minimum viable infrastructure
Checklist Before Publishing
- [ ] All instructions tested step-by-step
- [ ] Expected outputs documented
- [ ] Tab navigation links work
- [ ] Code blocks have language hints
- [ ] Resources calculated and appropriate
- [ ] Start order configured correctly
- [ ] TTL covers expected completion time plus buffer (about 1.5x your own run)
- [ ] Troubleshooting tips included
- [ ] Scenario has clear name and description
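The "Code blocks have language hints" item can be checked mechanically before publishing. The following is a rough sketch, assuming the instructions are stored as a markdown file with unescaped, non-nested triple-backtick fences that start at the beginning of a line; `find_unlabeled_fences` is a hypothetical helper name and `instructions.md` a hypothetical file name.

```shell
# find_unlabeled_fences is a hypothetical helper (an assumption, not a
# platform command). It prints the line number of every opening code fence
# that has no language hint. Assumes fences are not nested or indented.
find_unlabeled_fences() {
  awk '
    /^```/ {
      if (!in_block) {
        in_block = 1                        # this line opens a block
        if ($0 ~ /^```[ \t]*$/) print NR    # no language after the backticks
      } else {
        in_block = 0                        # this line closes the block
      }
    }
  ' "$1"
}

# Example usage (file name is illustrative):
#   find_unlabeled_fences instructions.md
```

An empty result means every opening fence carries a language hint; any printed number points at a fence to fix.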
Getting Help
If you're stuck:
- Check component logs in the Interfaces view
- Review pod events for deployment issues
- Test components individually to isolate problems
- Ask a colleague in your organization to review the scenario
