Well, when an instruction is sent to a processor, it often refers to one or more locations in memory that contain data the processor is to manipulate. Those locations are expressed as binary numbers. On a 32-bit processor, an address can be up to 32 bits long, which means there are 2^32 distinct memory locations, numbered 0 through 2^32 - 1. 2^32 memory locations works out to 4 GB of memory. So a 32-bit processor can only work directly with up to 4 GB of memory, which sets one boundary on its performance.
On a 64-bit processor, the numbers used to refer to memory locations can be up to 64 bits long, which means there are 2^64 possible addresses. That amounts to more than 16 exabytes (over 16 billion gigabytes) of memory the processor can address directly. So the amount of data a 64-bit processor can work with at once is not only mind-bogglingly large, it is also mind-bogglingly larger, roughly four billion times larger, than what a 32-bit processor can address. Consequently, 64-bit processors can vastly outperform 32-bit processors on tasks that need huge amounts of memory. I'll tell you about a very interesting real-world example of that a bit later.
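The arithmetic behind those numbers is easy to check for yourself. Here's a small Python sketch (an illustration I'm adding, not part of the original explanation) that computes both address-space sizes and compares them:

```python
# Each address bit doubles the number of distinct byte locations,
# so a processor with an N-bit address reaches 2**N bytes.
def address_space_bytes(width_bits: int) -> int:
    return 2 ** width_bits

GIB = 2 ** 30  # one gibibyte
EIB = 2 ** 60  # one exbibyte

space_32 = address_space_bytes(32)
space_64 = address_space_bytes(64)

print(space_32 // GIB)       # 4          -> 32-bit addresses reach 4 GB
print(space_64 // EIB)       # 16         -> 64-bit addresses reach 16 exabytes
print(space_64 // space_32)  # 4294967296 -> about 4 billion times larger
```

The last line is the key comparison: doubling the address width from 32 to 64 bits doesn't double the reachable memory, it multiplies it by 2^32.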