:''This article is about checksums calculated using addition. The term "checksum" is sometimes used in a more general sense to refer to any kind of redundancy check. Checksums on decimal numbers are discussed under check digit.''
A 'checksum' is a form of redundancy check, a simple way to protect the integrity of data by detecting errors in data that are sent through space (telecommunications) or time (storage). It works by adding up the basic components of a message, typically its bytes or its asserted bits, and storing the resulting value. Anyone can later perform the same operation on the data, compare the result to the stored checksum, and (if the two match) conclude that the message was probably not corrupted.
An example of a simple checksum:
★ Given 4 bytes of data (the method works for any number of bytes): 25h, 62h, 3Fh, 52h
★ Step 1: Adding all the bytes together gives 118h.
★ Step 2: Drop the carry nibble to give 18h.
★ Step 3: Take the two's complement of 18h to get E8h. This is the checksum byte.
★ To test the checksum byte, add it to the sum of the original bytes. This gives 200h.
★ Drop the carry nibble again, giving 00h. Since the result is 00h, the bytes were probably not changed.
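The steps above can be sketched in a few lines of Python. This is only an illustration of the worked example, not a standard routine: the function names `checksum_byte` and `verify` are chosen here for the sketch, and the checksum is assumed to be a single byte.

```python
def checksum_byte(data: bytes) -> int:
    """Return the one-byte checksum described in the worked example."""
    total = sum(data)        # Step 1: add all bytes (0x118 for the example)
    low = total & 0xFF       # Step 2: drop the carry, keeping the low byte (0x18)
    return (-low) & 0xFF     # Step 3: two's complement within one byte (0xE8)


def verify(data: bytes, checksum: int) -> bool:
    """Check that the data plus the checksum byte sum to 00h in the low byte."""
    return (sum(data) + checksum) & 0xFF == 0


data = bytes([0x25, 0x62, 0x3F, 0x52])
check = checksum_byte(data)
print(f"{check:02X}h")        # prints "E8h"
print(verify(data, check))    # prints "True"
```

Changing any single byte of `data` changes the low byte of the sum, so `verify` returns `False` for every single-byte corruption.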
The simplest form of checksum, which just adds up the bytes or asserted bits of the data, cannot detect several types of error. For example, such a checksum is not changed by:
★ reordering of the bytes in the message
★ inserting or deleting zero-valued bytes
★ multiple errors which sum to zero
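Each of these blind spots can be shown with a short sketch. It assumes the byte-wise additive checksum from the worked example; `additive_checksum` is a name chosen for this illustration, and the modified messages are made-up test data.

```python
def additive_checksum(data: bytes) -> int:
    """Two's complement of the byte sum, truncated to one byte."""
    return (-sum(data)) & 0xFF


original = bytes([0x25, 0x62, 0x3F, 0x52])

# Same bytes in a different order.
reordered = bytes([0x52, 0x3F, 0x62, 0x25])
# Zero-valued bytes inserted in the middle and at the end.
padded = bytes([0x25, 0x00, 0x62, 0x3F, 0x52, 0x00])
# Two errors that cancel: first byte +1, second byte -1.
offsetting = bytes([0x26, 0x61, 0x3F, 0x52])

for name, message in [("reordered", reordered),
                      ("padded", padded),
                      ("offsetting", offsetting)]:
    same = additive_checksum(message) == additive_checksum(original)
    print(name, "undetected" if same else "detected")
```

All three modified messages produce the same checksum as the original, so none of the changes is detected.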
More sophisticated types of redundancy check, including Fletcher's checksum, Adler-32, and cyclic redundancy checks (CRCs), are designed to address these weaknesses by considering not only the value of each byte but also its position. The cost of detecting more types of errors is the increased complexity of computing the redundancy check value.
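As a rough illustration of position sensitivity, the sketch below implements a Fletcher-16 variant (two running sums modulo 255) and compares it with the `adler32` and `crc32` functions from Python's standard `zlib` module. The `fletcher16` function is a minimal sketch written for this example, and the reordered message is made-up test data.

```python
import zlib


def fletcher16(data: bytes) -> int:
    """Fletcher-16 sketch: sum2 accumulates sum1, so byte position matters."""
    sum1 = 0
    sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255   # plain running sum of the bytes
        sum2 = (sum2 + sum1) % 255   # running sum of the running sums
    return (sum2 << 8) | sum1


original = bytes([0x25, 0x62, 0x3F, 0x52])
reordered = bytes([0x52, 0x3F, 0x62, 0x25])

# A plain byte sum is identical for both messages.
print(sum(original) == sum(reordered))                # prints "True"
# The position-aware checks all distinguish them.
print(fletcher16(original) != fletcher16(reordered))  # prints "True"
print(zlib.adler32(original) != zlib.adler32(reordered))  # prints "True"
print(zlib.crc32(original) != zlib.crc32(reordered))      # prints "True"
```

The second sum is what makes the order visible: an early byte is added into `sum2` more times than a late one, so swapping bytes changes the result even though `sum1` stays the same.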
These types of redundancy check are useful for detecting ''accidental'' modification, such as corruption of stored data or errors in a communication channel. However, they provide no security against a malicious agent, as their simple mathematical structure makes them trivial to circumvent. To provide this level of integrity, a cryptographic hash function such as SHA-256 is necessary. (Collisions have been found in the popular MD5 algorithm, and finding collisions in SHA-1 seems possible, but as of 2006 there is no evidence that SHA-256 suffers from similar weaknesses.)
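For comparison, a SHA-256 digest can be computed with Python's standard `hashlib` module. The helper name `sha256_hex` and the sample messages are assumptions made for this sketch.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = bytes([0x25, 0x62, 0x3F, 0x52])
reordered = bytes([0x52, 0x3F, 0x62, 0x25])

print(sha256_hex(original))
# Reordering the bytes produces a different digest.
print(sha256_hex(original) != sha256_hex(reordered))  # prints "True"
```

Unlike the one-byte checksum, the digest is 256 bits long, and no practical way is known to construct a second message with the same digest.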
On Unix, the "cksum" tool generates both a 32-bit CRC and a byte count for any given input file.
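The following is a bit-by-bit sketch of how such a CRC and byte count can be computed. It assumes the parameters that POSIX specifies for cksum (generator polynomial 04C11DB7h, initial value zero, the message length appended least significant byte first, and a final bitwise complement); the function name `posix_cksum` is chosen for this illustration, and the real tool should be treated as the reference.

```python
POLY = 0x04C11DB7  # CRC generator polynomial assumed from the POSIX description


def _feed(crc: int, byte: int) -> int:
    """Shift one byte into the CRC register, most significant bit first."""
    crc ^= byte << 24
    for _ in range(8):
        if crc & 0x80000000:
            crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
        else:
            crc = (crc << 1) & 0xFFFFFFFF
    return crc


def posix_cksum(data: bytes) -> tuple[int, int]:
    """Return (crc, byte_count) in the style of the cksum output."""
    crc = 0
    for byte in data:
        crc = _feed(crc, byte)
    # Append the length, least significant byte first, using as many
    # bytes as are needed to represent it (none for empty input).
    length = len(data)
    while length:
        crc = _feed(crc, length & 0xFF)
        length >>= 8
    return (~crc) & 0xFFFFFFFF, len(data)


print(posix_cksum(b""))           # prints "(4294967295, 0)"
print(posix_cksum(b"123456789"))
```

Including the length in the CRC means that inserting or deleting leading zero-valued bytes changes the result, which a plain CRC with a zero initial value would miss.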
See also
★ Check digit
★ File verification
★ Hamming code
★ Integrity check value
★ List of checksum algorithms
★ Luhn algorithm
★ Parity bit
External links
★ Additive Checksum Algorithms as Error Detection Codes