Bidirectional Models: Seeing the Whole Picture
Definition: A language model that determines a token's probability using both the preceding and the following context.
Uni-directional vs. Bi-directional
- Uni (GPT): “The cat sat on the [MASK]”. (Can only use “The cat sat on the” to guess).
- Bi (BERT): “The [MASK] sat on the mat”. (Can use “The” AND “sat on the mat” to guess).
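The contrast above can be sketched with a toy example. This is not a real language model, just a word-counting illustration over a tiny made-up corpus: the unidirectional guesser only sees the words before the blank, while the bidirectional guesser matches words on both sides.

```python
# Toy illustration (NOT a real LM): guess a masked word from a tiny
# hand-made corpus, first with left context only, then with both sides.
from collections import Counter

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat sat on the mat",
    "the cat slept on the mat",
]
sentences = [s.split() for s in corpus]

def guess_unidirectional(prefix):
    """Guess the next word using only the words BEFORE the blank."""
    counts = Counter()
    n = len(prefix)
    for words in sentences:
        if words[:n] == prefix and len(words) > n:
            counts[words[n]] += 1
    return counts.most_common(1)[0][0]

def guess_bidirectional(prefix, suffix):
    """Guess the masked word using words on BOTH sides of the blank."""
    counts = Counter()
    n = len(prefix)
    for words in sentences:
        if words[:n] == prefix and words[n + 1:] == suffix:
            counts[words[n]] += 1
    return counts.most_common(1)[0][0]

# "the cat sat on the [MASK]" -- only the left context is available
print(guess_unidirectional("the cat sat on the".split()))   # -> "mat"

# "the [MASK] sat on the mat" -- the right context narrows it to "cat"
print(guess_bidirectional(["the"], "sat on the mat".split()))  # -> "cat"
```

The bidirectional guesser can rule out "dog" because "sat on the mat" never follows it in the corpus, which is exactly the information a left-to-right model never sees.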
Coding is Bidirectional
Code is highly interdependent. A function at the bottom of the file might reference a variable defined at the top. A variable's type might be inferred from how it's used 50 lines later.
- The Challenge: GPT (Uni) reads code like a human reads a novel—start to finish. It sometimes misses the “future context.”
In-Filling (FIM)
Tools like Copilot use a trick called Fill-In-The-Middle (FIM).
- They take the prefix (code before cursor) and suffix (code after cursor).
- They format it like:
  <PRE> ...prefix... <SUF> ...suffix... <MID>
- This tricks the uni-directional model into seeing “both sides” so it can generate the connecting code.
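A minimal sketch of that rearrangement is below. The sentinel strings <PRE>/<SUF>/<MID> follow the template above; real FIM-trained models use their own tokenizer-specific sentinels, so treat these as placeholders rather than any particular model's vocabulary.

```python
# Sketch: rearrange prefix and suffix so a left-to-right model reads
# both sides of the cursor BEFORE it starts generating the middle.
# Sentinel strings are placeholders, not a specific model's tokens.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    return f"<PRE>{prefix}<SUF>{suffix}<MID>"

prefix = "def area(radius):\n    "
suffix = "\n    return result\n"
prompt = build_fim_prompt(prefix, suffix)

# The model consumes everything up to <MID>, then generates the middle
# (e.g. the line that computes `result`) one token at a time.
print(prompt)
```

Because the suffix is moved in front of the generation point, an ordinary left-to-right decoder gets "future context" without any architectural change.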
Practical Use
When using Vibe Coding for Refactoring:
- Highlight the code around the change, not just before it.
- Give the AI the “Suffix” (the code that comes after).
- Bad: “Here is the start of the function. Finish it.”
- Good: “Here is the start AND the return statement at the end. Fill in the logic between them.”
This constrains the AI (via the bidirectional context) to write code that actually fits.
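The "Good" pattern above can be sketched as a small prompt-builder. This is a hypothetical helper, not any particular tool's API; the instruction wording is illustrative.

```python
# Hypothetical sketch of the "Good" pattern: give the model the start of
# the function AND the code that comes after, and ask it to fill the gap.

def make_infill_request(prefix: str, suffix: str) -> str:
    return (
        "Fill in the missing logic between this prefix and suffix, "
        "so the completed function is consistent with both.\n"
        "Prefix:\n" + prefix +
        "Suffix:\n" + suffix
    )

prefix = "def normalize(values):\n    total = sum(values)\n"
suffix = "    return scaled\n"
print(make_infill_request(prefix, suffix))
```

Including `return scaled` in the suffix forces the generated middle to actually define `scaled`, which is the constraint the bare "finish it" prompt never imposes.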
