You See Neural Nets Wrong
A lot of beginner explanations make neural networks look like mystical webs of circles. That picture is memorable, but it is also the wrong thing to internalize. A neural network is not primarily a diagram. It is a composition of functions, and most of those functions are matrix multiplications followed by simple nonlinearities.
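The clean mental model, for an input batch $X$, a weight matrix $W$, a bias vector $b$, and an elementwise nonlinearity $\sigma$, is:
$$H = \sigma(XW + b)$$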
That is the forward pass. Data comes in as a matrix, weights transform it, biases shift it, nonlinearities bend it, and the next layer repeats the same pattern.
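As a rough illustration of that forward pass, here is a minimal NumPy sketch of two stacked layers. The helper names (`relu`, `dense_forward`), the layer sizes, and the choice of ReLU as the nonlinearity are assumptions made for this example, not something the text prescribes.

```python
import numpy as np

def relu(z):
    # Elementwise nonlinearity: negative entries become zero.
    return np.maximum(z, 0.0)

def dense_forward(X, W, b):
    # X: (batch, d_in), W: (d_in, d_out), b: (d_out,)
    # The bias vector broadcasts across the batch dimension.
    return relu(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))      # batch of 4 examples, 3 features each (assumed sizes)
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 2)), np.zeros(2)

H = dense_forward(X, W1, b1)     # first layer output, shape (4, 5)
Y = dense_forward(H, W2, b2)     # second layer repeats the same pattern, shape (4, 2)
print(H.shape, Y.shape)          # (4, 5) (4, 2)
```

Each call is the same three steps: multiply, shift, bend. Stacking layers is just calling the function again on the previous output.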
Why the circle diagram misleads
The circle diagram makes you think neuron by neuron. That is useful for the first five minutes, but real models are not implemented neuron by neuron. They are implemented as dense tensor operations.
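For one neuron with input vector $x$, weight vector $w$, and scalar bias $b$, you can say:
$$h = \sigma(w^\top x + b)$$
For a whole layer, stack the examples as rows of $X$, the neurons' weight vectors as columns of $W$, and the per-neuron biases in a vector $b$, and this becomes:
$$H = \sigma(XW + b)$$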
That single equation is the layer. What looked like many little arrows is really one matrix multiplication applied to the whole batch.
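One way to convince yourself that the two views agree is to compute a layer both ways and compare. The sketch below assumes NumPy, a sigmoid nonlinearity, and arbitrary small sizes; all names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 4))   # 6 examples, 4 input features (assumed sizes)
W = rng.normal(size=(4, 3))   # 3 neurons, one weight column each
b = rng.normal(size=3)        # one bias per neuron

# Neuron-by-neuron view: one dot product per (example, neuron) pair.
Y_loop = np.empty((6, 3))
for n in range(6):
    for j in range(3):
        Y_loop[n, j] = sigmoid(np.dot(X[n], W[:, j]) + b[j])

# Layer view: the same computation as a single matrix multiplication.
Y_matmul = sigmoid(X @ W + b)

same = np.allclose(Y_loop, Y_matmul)
print(same)  # True
```

The double loop is the circle diagram written out literally. The single line below it is what real implementations run.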
What learning means
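Learning means changing the weights so the function produces better outputs. The loss measures how wrong the prediction $\hat{y}$ is relative to the target $y$, using a loss function $\ell$ such as squared error:
$$L = \ell(\hat{y}, y)$$
Backpropagation computes the gradient of the loss with respect to each parameter, that is, how much a small change in that parameter would change the loss:
$$\frac{\partial L}{\partial W}, \qquad \frac{\partial L}{\partial b}$$
Then the optimizer nudges each parameter in the direction opposite its gradient, scaled by a learning rate $\eta$:
$$W \leftarrow W - \eta \frac{\partial L}{\partial W}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b}$$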
This is not magic. It is bookkeeping through a chain of matrix operations.
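To make the bookkeeping concrete, here is a hedged sketch of one training step for a tiny two-layer network, with the backward pass written by hand. The mean squared error loss, the tanh nonlinearity, the sizes, the learning rate, and every variable name are assumptions chosen for illustration. In practice an autodiff library does this step for you.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(8, 3))                  # inputs: (batch, d_in)
T = rng.normal(size=(8, 1))                  # targets: (batch, d_out)
W1, b1 = rng.normal(size=(3, 4)) * 0.5, np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)) * 0.5, np.zeros(1)

def forward(W1, b1, W2, b2):
    H = np.tanh(X @ W1 + b1)                 # hidden layer, (8, 4)
    Y = H @ W2 + b2                          # linear output layer, (8, 1)
    loss = np.mean((Y - T) ** 2)             # mean squared error (assumed loss)
    return H, Y, loss

H, Y, loss_before = forward(W1, b1, W2, b2)

# Backward pass: apply the chain rule one operation at a time,
# walking the forward pass in reverse.
dY = 2.0 * (Y - T) / Y.size                  # dL/dY, (8, 1)
dW2 = H.T @ dY                               # dL/dW2, (4, 1)
db2 = dY.sum(axis=0)                         # dL/db2, (1,)
dH = dY @ W2.T                               # dL/dH, (8, 4)
dZ1 = dH * (1.0 - H ** 2)                    # tanh'(z) = 1 - tanh(z)^2
dW1 = X.T @ dZ1                              # dL/dW1, (3, 4)
db1 = dZ1.sum(axis=0)                        # dL/db1, (4,)

# Plain gradient descent update: step against the gradient.
lr = 0.1                                     # learning rate (assumed value)
W1_new, b1_new = W1 - lr * dW1, b1 - lr * db1
W2_new, b2_new = W2 - lr * dW2, b2 - lr * db2

loss_after = forward(W1_new, b1_new, W2_new, b2_new)[2]
print(loss_before, loss_after)
```

Every gradient has the same shape as the parameter it belongs to, which is a quick sanity check when writing a backward pass by hand. For a small enough learning rate the second printed loss should usually be lower than the first, though a single step does not guarantee it.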
The important thing to memorize
Do not memorize neural nets as circles connected by arrows. Memorize them as repeated blocks:
- multiply by weights
- add bias
- apply nonlinearity
- compute loss
- propagate gradients backward
- update weights
That picture scales. MLPs, CNNs, transformers, and diffusion models all become less mysterious when you ask: what tensor shape enters this block, what operation transforms it, and where do gradients flow?
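A small habit that supports this way of thinking is to log the tensor shape after every block. The sketch below is one possible way to do that for a toy MLP in NumPy. The `trace_shapes` helper, the block names, and the sizes are hypothetical and only cover the forward direction.

```python
import numpy as np

def trace_shapes(x, blocks):
    """Run x through (name, fn) pairs and record the shape after each block."""
    shapes = [("input", x.shape)]
    for name, fn in blocks:
        x = fn(x)
        shapes.append((name, x.shape))
    return x, shapes

rng = np.random.default_rng(3)
W1, b1 = rng.normal(size=(16, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 10)), np.zeros(10)

blocks = [
    ("linear1", lambda x: x @ W1 + b1),       # (batch, 16) -> (batch, 32)
    ("relu", lambda x: np.maximum(x, 0.0)),   # shape unchanged
    ("linear2", lambda x: x @ W2 + b2),       # (batch, 32) -> (batch, 10)
]

X = rng.normal(size=(5, 16))
out, shapes = trace_shapes(X, blocks)
for name, shape in shapes:
    print(name, shape)
# input (5, 16)
# linear1 (5, 32)
# relu (5, 32)
# linear2 (5, 10)
```

The same question, "what shape goes in and what shape comes out", applies unchanged to convolution or attention blocks. Only the functions in the list differ.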
If you can track shapes, operations, and gradients, neural networks stop being diagrams and start being programs.