public/content/learn/attention-mechanism/calculating-attention-scores/calculating-attention-scores-content.md
+3 -3 (3 additions & 3 deletions)
@@ -202,16 +202,16 @@ print(attn_weights)
 **After softmax (each row sums to 1):**
 ```
          Pos0   Pos1   Pos2
-Query0 [0.576, 0.212, 0.212] ← Mostly attends to position 0
-Query1 [0.212, 0.576, 0.212] ← Mostly attends to position 1
+Query0 [0.506, 0.186, 0.308] ← Mostly attends to position 0
+Query1 [0.186, 0.506, 0.308] ← Mostly attends to position 1
 Query2 [0.333, 0.333, 0.333] ← Attends equally to all
 ```
 
 ### Understanding the Result
 
 **Position 0:**
 - Query matched Key0 best (score 2.0 before scaling)
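
The claim in this hunk that each row sums to 1 after softmax can be checked with a small sketch. The lesson's actual score matrix is not visible in this diff, so the `scaled_scores` values below are hypothetical, and the `softmax` helper is a plain-Python stand-in rather than the lesson's own code.

```python
import math

def softmax(row):
    """Convert a list of raw scores into weights that sum to 1."""
    shift = max(row)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - shift) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scaled scores, one row per query (assumed, not taken from the lesson)
scaled_scores = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [0.0, 0.0, 0.0],
]

attn_weights = [softmax(row) for row in scaled_scores]
for row in attn_weights:
    print([round(w, 3) for w in row], "sum =", round(sum(row), 3))
```

A row of identical scores, like the last one, always yields equal weights, which matches the Query2 row above.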
public/content/learn/math/functions/functions-content.md
+6 -6 (6 additions & 6 deletions)
@@ -53,7 +53,7 @@ For x = -1:
 
 f(-1) = 2(-1) + 3 = -2 + 3 = 1
 
-Now image a function that takes in "Cat sat on a" and returns "mat" - that function would be a lot more difficult to create, but neural networks (LLMs) can learn it.
+Now imagine a function that takes in "The cat sat on a" and returns "mat" - that function would be a lot more difficult to create, but neural networks (LLMs) can learn it.
 
 ### Example 2: Quadratic Function f(x) = x² + 2x + 1
 
@@ -109,7 +109,7 @@ Previous quadratic function will always give 9 if x=2 and nothing else.
 
 ## Code Examples
 
-Our 2 functions coded in python, if you are unfamiliar with python you can skip the code, next module will focus on python.
+Our 2 functions coded in Python, if you are unfamiliar with Python you can skip the code, next module will focus on Python.
-**e** is a famous constant (Euler's number) used in math everywhere, it's value is approximately 2.718
+**e** is a famous constant (Euler's number) used in math everywhere, its value is approximately 2.718
 
 **f(x) = 1 / (1 + e^(-x))**
 
@@ -379,7 +379,7 @@ def sigmoid_derivative(x):
 
 
 
-We will learn derivativers in the next lesson, but I included the images here - derivative tells you how fast the function is changing - you see that when sigmoid function is growing fastest (in the middle), the derivative value is spiking.
+We will learn derivatives in the next lesson, but I included the images here - derivative tells you how fast the function is changing - you see that when sigmoid function is growing fastest (in the middle), the derivative value is spiking.
 
 Just look at the slope of the function, if it's big (changing fast), the derivative will be big.
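
The point that the derivative peaks where the sigmoid grows fastest can be seen numerically. The lesson's own `sigmoid_derivative` body is not shown in this diff, so the version below is an assumed sketch that reuses the name and relies on the standard identity f'(x) = f(x)(1 - f(x)).

```python
import math

def sigmoid(x):
    """f(x) = 1 / (1 + e^(-x))"""
    return 1 / (1 + math.exp(-x))

def sigmoid_derivative(x):
    """Slope of the sigmoid at x, via f'(x) = f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1 - s)

# Compare the function value and its slope at a few points around the middle
for x in [-4, -2, 0, 2, 4]:
    print(x, round(sigmoid(x), 3), round(sigmoid_derivative(x), 3))
```

The slope is largest at x = 0 and shrinks toward zero in both directions, where the curve flattens out.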