💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 63b45629e3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review"
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```diff
@@ -130,8 +130,8 @@ def roll(
     return func.roll(shifts, axis)(a)


-def gather(a: Tensor, /, dim: int, index: Tensor) -> Tensor:
-    return func.gather(dim)(a, index)
+def gather(a: Tensor, /, axis: int, index: Tensor) -> Tensor:
```
Preserve dim keyword compatibility in lucid.gather
Renaming the public argument from dim to axis without a compatibility path breaks existing call sites that still pass dim= (for example, lucid/models/text/bert/_model.py uses lucid.gather(..., dim=1, ...) in the default embedding path). In practice, BERT/RoFormer calls that omit token_type_ids will now raise TypeError: gather() got an unexpected keyword argument 'dim', which is a regression outside the new GPT code.
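One way to make the rename non-breaking is a keyword shim that remaps the legacy `dim=` onto `axis` with a deprecation warning. The sketch below is hypothetical, not lucid's actual code: the list-based gather body is a toy stand-in for `func.gather(axis)(a, index)` so the compatibility logic can be shown self-contained.

```python
import warnings


def gather(a, /, axis=None, index=None, *, dim=None):
    # Legacy `dim=` call sites (e.g. lucid.gather(..., dim=1, ...))
    # keep working: remap onto `axis` with a warning instead of
    # raising TypeError.
    if dim is not None:
        if axis is not None:
            raise TypeError("pass either 'axis' or 'dim', not both")
        warnings.warn("'dim' is deprecated; use 'axis'", DeprecationWarning)
        axis = dim
    if axis is None or index is None:
        raise TypeError("gather() requires 'axis' and 'index'")
    # Toy stand-in for func.gather(axis)(a, index): pick index[i]
    # from each row of a list-of-lists along axis 1.
    assert axis == 1, "toy sketch only handles axis=1"
    return [row[i] for row, i in zip(a, index)]
```

Old `dim=` callers then emit a `DeprecationWarning` for one release cycle instead of crashing, and the keyword can be dropped later.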
```python
attr = getattr(_activation, attr_name, None)
if issubclass(attr, Module):
    if attr.__name__.lower() == act_name:
        return attr
```
Return an activation instance, not the activation class
get_activation_module_from_name currently returns the module class itself (e.g., GELU) rather than an instantiated module. _GPTMLP stores this in self.act and then executes self.act(self.c_fc(x)), which invokes class construction with a tensor argument and fails for the default hidden_act='gelu'; this makes GPT forward unusable.
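The fix is to instantiate before returning, i.e. `return attr()` instead of `return attr`. A minimal self-contained sketch (the `Module`/`GELU` classes and the `_activation` registry here are toy stand-ins for lucid's, using the tanh GELU approximation on plain floats):

```python
import math


class Module:  # stand-in for lucid.nn.Module
    def __call__(self, x):
        return self.forward(x)


class GELU(Module):  # tanh-approximation GELU on a scalar
    def forward(self, x):
        return 0.5 * x * (1.0 + math.tanh(
            math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


_activation = {"GELU": GELU}  # stand-in for the _activation namespace


def get_activation_module_from_name(act_name):
    for attr in _activation.values():
        # Guard with isinstance(attr, type): issubclass raises
        # TypeError if the looked-up attribute is not a class.
        if isinstance(attr, type) and issubclass(attr, Module):
            if attr.__name__.lower() == act_name:
                return attr()  # instantiate; returning `attr` itself
                               # makes self.act(x) call the constructor
    raise ValueError(f"unknown activation: {act_name}")
```

With the instance returned, `self.act(self.c_fc(x))` dispatches to `forward` as intended instead of attempting `GELU(tensor)` construction.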
```python
max_len = config.max_position_embeddings
self.register_buffer(
    "causal_mask",
    lucid.tril(lucid.ones(max_len, max_len)).reshape(1, 1, max_len, max_len),
```
Build GPT causal mask as additive -inf mask
The registered causal_mask is lower-triangular 0/1, but the attention kernel treats attn_mask as an additive bias to scores (masked positions must be large negative values). Adding this 0/1 mask does not block future tokens, so training/inference are no longer causal and tokens can attend to the future.
```python
attn_out = attn_out.swapaxes(1, 2).reshape(B, T, C)
attn_out = self.resid_drop(self.c_proj(attn_out))

return attn_out, past_key_value if use_cache else None
```
Return valid cache objects when use_cache is enabled
When use_cache=True and no past_key_values are provided (first decoding step), _GPTAttention returns None for each layer’s present cache, so the model returns a list of None. Passing that list back into the next forward causes past_key_values[0].get_seq_length() to crash, which breaks iterative generation with returned caches.
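The attention layer should build the updated (key, value) cache itself and return it whenever `use_cache=True`, including on the first step when no past cache exists. A minimal sketch of that control flow, using plain lists in place of lucid tensors and a tuple in place of the real cache object (both are illustrative assumptions):

```python
def attention_step(k_new, v_new, past_key_value=None, use_cache=False):
    # Extend any cached keys/values with the new step's entries.
    if past_key_value is not None:
        k_past, v_past = past_key_value
        k = k_past + k_new   # stand-in for concatenation along time
        v = v_past + v_new
    else:
        k, v = k_new, v_new
    # ... attention over the full (k, v) would happen here ...
    # Always return the updated cache when caching is requested,
    # never None on the first decoding step.
    present = (k, v) if use_cache else None
    return present
```

Feeding `present` back in as `past_key_value` on the next call then yields a growing cache instead of a list of `None` that crashes on `get_seq_length()`.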