Byte order (big-endian vs little-endian) is another compatibility issue
across processor architectures. The issue stems from the order in which
the bytes of a multibyte value are stored in memory. The x86
architecture is little-endian: the least significant byte is stored at the lowest address.
For example, the hexadecimal number 0x12345678 is stored in memory as:
address   contents
0         0x78
1         0x56
2         0x34
3         0x12
A big-endian processor would store the data in the following order:
address   contents
0         0x12
1         0x34
2         0x56
3         0x78
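The two layouts above can be observed directly by copying a value's bytes out of memory in address order. The following is a minimal sketch, not part of the original text; the helper names bytes_in_memory() and host_is_little_endian() are hypothetical and chosen only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes of a 32-bit value into out[],
 * lowest address first, exactly as they sit in memory. */
static void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Hypothetical helper: return 1 if the host stores the least
 * significant byte at the lowest address (little-endian), else 0. */
static int host_is_little_endian(void)
{
    unsigned char b[4];

    bytes_in_memory(0x12345678u, b);
    return b[0] == 0x78;
}
```

On an x86 machine, bytes_in_memory(0x12345678u, b) should fill b with 0x78, 0x56, 0x34, 0x12, matching the first table; a big-endian machine should produce the second.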
This issue is worrisome on a number of fronts:
- typecast mangling
- hardware access
- network transparency
The first and second points are closely related.
- Typecast mangling
Consider the following code:
#include <stdio.h>

void func (void)
{
    long a = 0x12345678;
    unsigned char *p;

    /* Point at the lowest-addressed byte of a. */
    p = (unsigned char *) &a;
    printf ("%02X\n", *p);
}
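On a little-endian machine, this prints 78 (the least significant byte); on a big-endian machine
with a 32-bit long, it prints 12. (Where long is 64 bits wide, a big-endian machine prints 00,
because the first byte is then the most significant byte of the 64-bit value.)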
This is one of the big (pardon the pun) reasons why structured programmers generally frown on
typecasts.
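One way to avoid the cast is to pick bytes out by arithmetic rather than by address, since shifts and masks operate on the value and not on its memory layout. This is a sketch under that assumption; byte_of() is a hypothetical helper name, not something from the original text.

```c
#include <stdint.h>

/* Hypothetical helper: return byte n of value, where byte 0 is the
 * least significant byte. n must be in the range 0..3.
 * The result is the same on big-endian and little-endian hosts. */
static unsigned char byte_of(uint32_t value, unsigned n)
{
    return (unsigned char)((value >> (8u * n)) & 0xFFu);
}
```

With this helper, byte_of(0x12345678, 0) yields 0x78 and byte_of(0x12345678, 3) yields 0x12 regardless of the host's byte order.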
- Hardware access
Sometimes the hardware can present you with a conflicting choice of the correct
size for a chunk of data.
Consider a piece of hardware that has a 4 KB memory window.
If the hardware brings various data structures into view with that window, it's impossible to determine
a priori what the data size should be for a particular element of the window.
Is it a 32-bit long integer?
An 8-bit character?
Blindly casting a pointer into the window, as in the code sample above, will land you in trouble,
because the CPU interprets the bytes using its own byte order,
regardless of the order in which the hardware presents them.
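When the device's documentation states the size and byte order of a field, the value can be assembled one byte at a time so that the host CPU's byte order never enters into it. The sketch below assumes the device tolerates byte-wide reads of its window (some hardware requires a specific access width); read_le32() and read_be32() are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit field that the device stores
 * little-endian, starting at the given byte offset in the window. */
static uint32_t read_le32(const volatile unsigned char *window, size_t offset)
{
    const volatile unsigned char *p = window + offset;

    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: read a 32-bit field that the device stores
 * big-endian, starting at the given byte offset in the window. */
static uint32_t read_be32(const volatile unsigned char *window, size_t offset)
{
    const volatile unsigned char *p = window + offset;

    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

The caller still has to know, from the hardware documentation, which of the two applies to a given element of the window; the code only guarantees that the answer is the same on every host.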
- Network transparency
These issues are naturally compounded when heterogeneous CPUs are used in a network with messages
being passed among them.
If the implementor of the message-passing scheme doesn't decide up front what byte order will be used, then
some form of identification needs to be done so that a machine with a different byte ordering can receive and
correctly decode a message from another machine.
This problem has been solved with protocols like TCP/IP, where a defined network byte order
is always adhered to, even between homogeneous machines whose byte order differs from the network byte order.
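As an illustration of the network byte order convention, the POSIX functions htonl() and ntohl() (declared in <arpa/inet.h>) convert a 32-bit value between host order and network (big-endian) order. The sketch below assumes a POSIX system; pack_u32() and unpack_u32() are hypothetical helper names, not part of any standard API.

```c
#include <arpa/inet.h>  /* htonl(), ntohl() (POSIX) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit host value into a message
 * buffer in network byte order (most significant byte first). */
static void pack_u32(unsigned char *buf, uint32_t host_value)
{
    uint32_t wire = htonl(host_value);

    memcpy(buf, &wire, sizeof wire);
}

/* Hypothetical helper: read a 32-bit value in network byte order
 * from a message buffer and return it in host byte order. */
static uint32_t unpack_u32(const unsigned char *buf)
{
    uint32_t wire;

    memcpy(&wire, buf, sizeof wire);
    return ntohl(wire);
}
```

A sender that calls pack_u32(buf, 0x12345678) places the bytes 0x12, 0x34, 0x56, 0x78 in the buffer on any host, and a receiver that calls unpack_u32(buf) recovers 0x12345678 whatever its own byte order is.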