loss as a function of error
regression: drag points to add outliers
readout: slope of the L2 fit, slope of the L1 fit
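The effect of outliers on the two fitted slopes can be reproduced offline. The sketch below is an illustration, not the page's own code: the synthetic data (true slope 2, three points shifted up by 40), the helper names `fit_l2` and `fit_l1`, and the choice of iteratively reweighted least squares for the L1 fit are all assumptions made for the example.

```python
import numpy as np

def fit_l2(x, y):
    """Least-squares line fit: minimizes the sum of squared residuals."""
    A = np.column_stack([x, np.ones_like(x)])
    slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]
    return slope, intercept

def fit_l1(x, y, iters=100, eps=1e-6):
    """Approximate least-absolute-deviations fit via iteratively
    reweighted least squares (one possible method, assumed here)."""
    A = np.column_stack([x, np.ones_like(x)])
    coef = np.linalg.lstsq(A, y, rcond=None)[0]   # start from the L2 solution
    for _ in range(iters):
        r = np.abs(y - A @ coef)
        w = 1.0 / np.maximum(r, eps)              # weight 1/|r| turns r^2 into |r|
        sw = np.sqrt(w)
        coef = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
    return coef[0], coef[1]

# Synthetic data (hypothetical): y = 2x + 1 plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.3, size=x.size)
y[-3:] += 40.0                                    # three outliers at large x

l2_slope, _ = fit_l2(x, y)
l1_slope, _ = fit_l1(x, y)
print("L2 slope:", l2_slope)
print("L1 slope:", l1_slope)
```

With this data the L2 slope is pulled well away from 2 by the three shifted points, while the L1 slope stays close to it.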
gradient: how each loss reacts to errors
L2 gradient grows linearly with the error; L1 gradient has constant magnitude
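A minimal numeric check of this statement, assuming the per-sample losses are defined as e² for L2 and |e| for L1 (the page does not state the scaling, so the factor of 2 below is an assumption; with ½e² the L2 gradient would be e). The function names are hypothetical.

```python
import numpy as np

def l2_grad(e):
    """Derivative of e**2 with respect to e: proportional to the error."""
    return 2.0 * e

def l1_grad(e):
    """Derivative of |e| with respect to e: only the sign of the error
    (np.sign returns 0 at e == 0, where |e| is not differentiable)."""
    return np.sign(e)

errors = np.array([-10.0, -1.0, 0.5, 1.0, 10.0])
print("L2 gradients:", l2_grad(errors))
print("L1 gradients:", l1_grad(errors))
```

An error ten times larger produces an L2 gradient ten times larger, which is why a single outlier can dominate an L2 update, whereas every nonzero error contributes an L1 gradient of magnitude 1.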
huber loss: the compromise
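As a sketch of the compromise, the standard Huber loss is quadratic for small errors and linear for large ones. The threshold `delta=1.0` and the function names are assumptions for illustration; the page does not say which threshold it uses.

```python
import numpy as np

def huber(e, delta=1.0):
    """Huber loss: 0.5*e**2 when |e| <= delta, else delta*(|e| - 0.5*delta)."""
    a = np.abs(e)
    return np.where(a <= delta, 0.5 * e**2, delta * (a - 0.5 * delta))

def huber_grad(e, delta=1.0):
    """Gradient of the Huber loss: the error itself, clipped to [-delta, delta]."""
    return np.clip(e, -delta, delta)

errors = np.array([-5.0, 0.2, 5.0])
print("Huber loss:    ", huber(errors))
print("Huber gradient:", huber_grad(errors))
```

Inside the threshold the gradient behaves like L2 (proportional to the error); outside it behaves like L1 (constant magnitude `delta`), so large outliers cannot dominate the update.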
signal fitting: same network, different loss
legend: true signal, L2 network, L1 network
training runs for 3000 epochs