Batch Size: Tuning the Learning Process
Definition: The number of examples processed in a single training iteration, ranging from 1 (pure stochastic gradient descent) up to the entire dataset (full-batch gradient descent); common values run from 32 into the thousands.
Small Batch vs. Large Batch
- Small Batch (e.g., 32): Noisy gradient updates. The model "wiggles" around the loss surface. Good for escaping local minima.
- Large Batch (e.g., 8192): Stable updates and faster wall-clock training thanks to parallelism, but can get stuck in "sharp" minima that generalize worse.
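The trade-off can be seen in a toy experiment. A minimal sketch in plain Python, using synthetic data, a single weight, and hand-rolled mini-batch SGD (all names here are illustrative):

```python
import random
import statistics

random.seed(0)

# Toy linear regression data: y = 3x + noise
DATA = [(x := random.gauss(0, 1), 3 * x + random.gauss(0, 0.1)) for _ in range(1024)]

def train(batch_size, steps=200, lr=0.1):
    """Plain mini-batch SGD on a single weight; returns the weight trajectory."""
    w = 0.0
    history = []
    for _ in range(steps):
        batch = random.sample(DATA, batch_size)
        # Gradient of mean squared error w.r.t. w, averaged over the batch
        grad = sum(2 * (w * x - y) * x for x, y in batch) / batch_size
        w -= lr * grad
        history.append(w)
    return history

small = train(batch_size=8)    # noisy updates: w "wiggles" around the optimum (3)
large = train(batch_size=512)  # stable updates: w settles smoothly near 3

# The small-batch run fluctuates more once it has converged
print(statistics.stdev(small[-50:]), statistics.stdev(large[-50:]))
```

Both runs reach roughly the same answer; the difference is in how much the small-batch trajectory keeps jittering after convergence.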
The “Context Batch”
In Vibe Coding, “Batch Size” is how much information you feed the AI at once.
- Small Context: You feed one function.
  - Pros: The AI focuses tightly on that function.
  - Cons: It might break the rest of the app because it can't see the other files.
- Large Context: You feed the whole codebase.
  - Pros: Global awareness.
  - Cons: "Lost in the Middle." It might lose track of the specific, detailed instruction you gave.
Tuning Your Workflow
- Refactoring: Use Large Batch (give it the whole module).
- Bug Fix: Use Small Batch (give it just the error and the crashing function).
- New Feature: Start Large (Architecture), then go Small (Implementation).
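The heuristics above can be sketched as a small dispatcher. Everything here is hypothetical: the task names, file layout, and `build_context` helper are illustrative, not a real tool's API:

```python
from pathlib import Path

def build_context(task: str, module_dir: str, error: str = "", crash_file: str = "") -> str:
    """Assemble the 'information batch' to paste into the model's prompt."""
    if task == "bug_fix":
        # Small batch: just the traceback plus the crashing file
        return error + "\n\n" + Path(crash_file).read_text()
    if task == "refactor":
        # Large batch: every source file in the module, labeled by name
        files = sorted(Path(module_dir).glob("*.py"))
        return "\n\n".join(f"# {f.name}\n{f.read_text()}" for f in files)
    raise ValueError(f"unknown task: {task}")
```

A "new feature" would call this twice: once in refactor mode to plan the architecture, then in bug-fix-sized slices for each implementation step.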
Computational Limits
Just as GPU RAM limits training batch size, Context Window limits your prompting batch size.
- Gemini 1.5 Pro: 1M+ tokens. (Huge batch size; most single codebases fit whole.)
- GPT-4 Turbo: 128k tokens. (Medium batch size.)
- DeepSeek: 32k tokens. (Small batch size.)
Know your model’s capacity and adjust your “information batch” accordingly.
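One way to "know your capacity" is a rough pre-flight check before pasting. The sketch below assumes the window sizes quoted above and the common rule of thumb of roughly 4 characters per token for English text; the model keys and the `fits` helper are illustrative, and real counts come from each model's tokenizer:

```python
CONTEXT_WINDOWS = {           # token limits as quoted in the text above
    "gemini-1.5-pro": 1_000_000,
    "gpt-4-turbo": 128_000,
    "deepseek": 32_000,
}

def fits(model: str, prompt: str, reserve: int = 4_000) -> bool:
    """True if the estimated prompt size, plus a reserve for the reply, fits the window."""
    estimated_tokens = len(prompt) // 4  # ~4 chars/token heuristic, not exact
    return estimated_tokens + reserve <= CONTEXT_WINDOWS[model]

print(fits("deepseek", "x" * 200_000))      # ~50k tokens + reserve > 32k -> False
print(fits("gpt-4-turbo", "x" * 200_000))   # ~50k + 4k <= 128k -> True
```

If the check fails, shrink the batch: drop files the task doesn't touch before trimming the instructions themselves.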
