Well, it's not really wrong; it's just efficient.
The stack top pointer is always one past the top value (top value + 1). This is not the usual behaviour for a stack, but in our operation we decrement the pointer first. That makes the offset to the top value zero, which is efficient in terms of machine cycles.
But why do that in the first place?
Because if we do not decrement the stack pointer before executing the instruction, things get complicated for jmp (branch) instructions. Since we must decrement the stack pointer no matter what, we do it first.
If we used a negative offset instead, decoding the instruction with the offset would take a few more cycles than decoding it without one.
This post is just to make sure we remember that the stack top pointer points to invalid data. We must decrement it by one to get the top value.
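
To make the convention concrete, here is a minimal sketch in C. It is not the actual implementation: the `Stack` type, `STACK_SIZE`, and the `push`/`pop` helpers are names assumed purely for illustration. The only point it shows is that `sp` sits one slot past the top value, so a pop decrements first and then reads at offset zero.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SIZE 16  /* assumed capacity, for illustration only */

/* Hypothetical stack: sp is the index one past the top value. */
typedef struct {
    int    data[STACK_SIZE];
    size_t sp;  /* next free slot; data[sp] itself holds no valid value */
} Stack;

/* Push writes into the free slot, then moves sp past it. */
static void push(Stack *s, int v) {
    assert(s->sp < STACK_SIZE);
    s->data[s->sp++] = v;
}

/* Pop decrements sp first, then reads the top value at offset zero. */
static int pop(Stack *s) {
    assert(s->sp > 0);
    --s->sp;
    return s->data[s->sp];
}
```

Reading `data[sp]` without the decrement would return whatever happens to be in the free slot, which is exactly the invalid data mentioned above.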