What is Softmax?
Softmax transforms a vector of numbers into a probability distribution. Each output is between 0 and 1, and all outputs sum to 1.
Temperature Effect:
As you increase the temperature multiplier (scale inputs up), the softmax becomes more "confident." At high temperatures, this approaches a step function where the highest input gets probability ≈ 1 and others get ≈ 0.