What is Softmax?
Softmax transforms a vector of numbers into a probability distribution. Each output is between 0 and 1, and all outputs sum to 1.
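To make the definition concrete, here is a minimal sketch in plain Python. The function name `softmax` and the sample inputs are illustrative choices, not taken from any particular library. Each output is exp(x_i) divided by the sum of all exp(x_j); subtracting the maximum first is a common numerical-stability trick that does not change the result.

```python
import math

def softmax(values):
    """Map a list of real numbers to probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow in exp() and leaves the
    # resulting probabilities unchanged.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative inputs (assumed for this example).
probs = softmax([2.0, 1.0, 0.1])
print(probs)       # roughly [0.66, 0.24, 0.10]
print(sum(probs))  # 1.0, up to floating-point rounding
```

The largest input receives the largest probability, but the smaller inputs still keep a noticeable share.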
Temperature Effect:
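As you increase the multiplier applied to the inputs (scale them up), softmax becomes more "confident": the output approaches a one-hot vector in which the highest input gets probability ≈1 and the others get ≈0 (assuming a single largest input). Note that under the usual convention, where the inputs are divided by a temperature T, this corresponds to lowering T, not raising it; the multiplier here plays the role of 1/T. With only two inputs, the limit is a literal step function of their difference. This is what people mean by "acting like a step function"!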
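The scaling effect can be checked numerically. The sketch below is only an illustration under assumed names and values: `scaled_softmax`, the `scale` parameter, and the sample inputs are hypothetical. It multiplies the inputs by `scale` before applying softmax and prints the result for a few scales.

```python
import math

def scaled_softmax(values, scale=1.0):
    """Softmax of the inputs after multiplying each one by `scale`."""
    scaled = [scale * v for v in values]
    # Subtract the maximum so exp() cannot overflow for large scales.
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative inputs (assumed for this example).
logits = [2.0, 1.0, 0.1]

results = {}
for scale in (0.1, 1.0, 10.0, 100.0):
    results[scale] = scaled_softmax(logits, scale)
    print(scale, [round(p, 4) for p in results[scale]])
```

With a small scale the probabilities are close to uniform; with a large scale nearly all of the probability mass lands on the largest input.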