.NET Reduced CRC Library
For the Italian version of this page, click here.
Index
CRC Fundamentals
Why CRC?
CRC stands for cyclic redundancy check. Great, but what is it for? A CRC is a number associated with a bit stream. Imagine you have to send 8 bytes. How can you know they arrived correctly? If you must be sure, you have no choice but to ask for the bytes back. If you receive the same 8 bytes, you can be (almost) sure they were sent without errors. This works fine when you need a high degree of trust. The big problem is efficiency: every byte travels through the network at least twice. In some situations you need a quicker, dirtier method.
First solution
You can ask for the sum of the bytes received. If you send the numbers 8 and 22, you know you must get back 30. This way, for each pair of bytes sent you receive only one extra byte, a 1:2 overhead, which is much better than before. Still, this method isn't as good as you may think. If the unlucky receiver gets 15 and 15, the sum is still 30 and you'll never know. Another problem is overflow: the sum easily exceeds what a single byte can hold. You definitely need a stronger solution.
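The weakness of the plain sum can be seen in a few lines of C#. This is only a minimal sketch: the class SumCheck and its methods are hypothetical names made up for illustration and are not part of the library.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the library.
public static class SumCheck
{
    // Adds up all the bytes. The result is an int because the sum
    // can easily grow past what a single byte can hold.
    public static int Sum(byte[] data)
    {
        int total = 0;
        foreach (byte b in data)
        {
            total += b;
        }
        return total;
    }

    // True when the sum could be sent back as one extra byte.
    public static bool FitsInByte(int sum)
    {
        return sum >= byte.MinValue && sum <= byte.MaxValue;
    }
}
```

SumCheck.Sum(new byte[] { 8, 22 }) and SumCheck.Sum(new byte[] { 15, 15 }) both return 30, so the corrupted pair passes the check. SumCheck.Sum(new byte[] { 200, 100 }) returns 300, which no longer fits in a byte.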
CRC
Think about the overflow problem. Which integer operator never exceeds a given limit? Yes, it's the modulo. n % 2 gives 0 or 1. No way out. A CRC works this way. It uses some space-shuttle function (better known as a polynomial) to get the modulo out of a set of bytes: the CRC is the remainder left after dividing the data by that polynomial. The results are statistically good (unless you pick a bad divisor). You don't need to know exactly how the calculations work to use a CRC library (such as mine). You give it a stream of bytes and get back its CRC.
For instance, Ethernet, which carries much of the Internet's traffic, protects every frame with a CRC (TCP itself uses a simpler 16-bit checksum). This way you are (statistically) sure you'll get exactly what you asked for.
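If you are curious what the "modulo with a polynomial" looks like in code, here is a minimal bitwise sketch of a 16-bit CRC. It is not the library's code. The class name Crc16Sketch is hypothetical, and the polynomial (0x1021) and initial value (0xFFFF) are assumptions taken from the widely used CRC-16/CCITT-FALSE variant; the library may use different parameters.

```csharp
using System;

// Illustrative only: the polynomial and initial value are assumptions
// (CRC-16/CCITT-FALSE), not necessarily what the library uses.
public static class Crc16Sketch
{
    private const ushort Polynomial = 0x1021;
    private const ushort InitialValue = 0xFFFF;

    public static ushort Compute(byte[] data)
    {
        ushort crc = InitialValue;
        foreach (byte b in data)
        {
            // Bring the next byte into the top of the 16-bit register.
            crc ^= (ushort)(b << 8);
            for (int bit = 0; bit < 8; bit++)
            {
                // Long division modulo 2: shift left by one bit and,
                // if the bit shifted out was 1, XOR in the polynomial.
                bool topBitSet = (crc & 0x8000) != 0;
                crc = (ushort)((crc << 1) & 0xFFFF);
                if (topBitSet)
                {
                    crc ^= Polynomial;
                }
            }
        }
        // Whatever is left in the register is the remainder: the CRC.
        return crc;
    }
}
```

For the ASCII bytes of "123456789", the usual test input for CRC variants, Compute returns 0x29B1. Unlike a plain sum, it gives different results for { 8, 22 } and { 15, 15 }, and the value can never outgrow 16 bits.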
Why reduced?
Don't be fooled by the term "reduced". This library works well (at least it should). Reduced means that it doesn't follow the standard CRC implementations. For instance, when you checksum a big file you normally apply the CRC to every byte of the file. This is disk intensive: a video clip can easily be 500 MB, and even the fastest hard disk will be brought to its knees. My approach lets you compute a CRC from random fragments of that big file instead of all of it. This means it is easier for two different files to yield the same CRC, but the scanning is much (much) faster. The risk of a wrong CRC increases by 1 per million. I think it's a good deal.
However, I won't decide for you. If you absolutely need the complete CRC, you can get it simply by treating the file as one big stream of bytes (as you should do anyway when dealing with non-seekable streams).
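The trade-off can be sketched in C#. This is not how the library is implemented. The class SampledCrcSketch is a hypothetical name, the fragment count and size are arbitrary parameters, the CRC parameters (polynomial 0x1021, initial value 0xFFFF) are assumptions, and the fragments are evenly spaced rather than random so that the same file always gives the same result.

```csharp
using System;
using System.IO;

// Hypothetical sketch; the sampling scheme and the CRC-16 parameters
// are assumptions, not the library's actual behaviour.
public static class SampledCrcSketch
{
    // Bitwise CRC-16 update (polynomial 0x1021) for one byte.
    private static ushort Update(ushort crc, byte b)
    {
        crc ^= (ushort)(b << 8);
        for (int bit = 0; bit < 8; bit++)
        {
            bool topBitSet = (crc & 0x8000) != 0;
            crc = (ushort)((crc << 1) & 0xFFFF);
            if (topBitSet) crc ^= 0x1021;
        }
        return crc;
    }

    // Complete CRC: reads every byte from the current position to the end.
    public static ushort Full(Stream stream)
    {
        ushort crc = 0xFFFF;
        int value;
        while ((value = stream.ReadByte()) != -1)
        {
            crc = Update(crc, (byte)value);
        }
        return crc;
    }

    // Reduced CRC: reads only `fragments` evenly spaced pieces of
    // `fragmentSize` bytes each. The stream must be seekable.
    public static ushort Sampled(Stream stream, int fragments, int fragmentSize)
    {
        if (fragments < 1) throw new ArgumentOutOfRangeException("fragments");
        if (fragmentSize < 1) throw new ArgumentOutOfRangeException("fragmentSize");

        long length = stream.Length;
        // Small input: sampling would skip nothing, so read it all.
        if (length <= (long)fragments * fragmentSize)
        {
            stream.Position = 0;
            return Full(stream);
        }

        ushort crc = 0xFFFF;
        byte[] buffer = new byte[fragmentSize];
        long lastStart = length - fragmentSize;
        for (int i = 0; i < fragments; i++)
        {
            // Spread the fragment starts between offset 0 and the last possible start.
            stream.Position = fragments == 1 ? 0 : lastStart * i / (fragments - 1);
            int read = 0;
            while (read < fragmentSize)
            {
                int n = stream.Read(buffer, read, fragmentSize - read);
                if (n == 0) break;
                read += n;
            }
            for (int j = 0; j < read; j++)
            {
                crc = Update(crc, buffer[j]);
            }
        }
        return crc;
    }
}
```

With 4 fragments of 10 bytes on a 1000-byte stream, only bytes 0-9, 330-339, 660-669 and 990-999 are read. A change at offset 335 alters the sampled CRC, while a change at offset 500 goes unnoticed unless you call Full. That is the price paid for the speed.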
Installation
Download the file and unzip it into a temporary folder. You can then register it in the global assembly cache (GAC) or just place it in a staging folder. To install it in the GAC, type the following at the command line:
gacutil /i MindFlavor.CRC.dll
Use
Clicking here will open a new window with the complete documentation. The help file is included with the downloaded package.
Using the class is easy: just call its static method CalculateCRC. Pass it a System.IO.FileInfo or a string to get the reduced CRC of a file. If you need to checksum a stream of bytes, there is an overload just for that.
An Example
I've written a simple console application to test the library. You can read the complete code here. Below I'll skip the reflection code and focus on the library call.
...
...
...
foreach (FileInfo fi in files)
{
    // Reduced CRC of the file, returned as an unsigned 32-bit integer.
    uint uiCRC = CRC.CalculateCRC(fi);
    Console.WriteLine(fi.Name + "\tCRC: 0x" + uiCRC.ToString("x"));
}
...
...
...
We just pass the System.IO.FileInfo instance to the static method CalculateCRC and get the result back (as an unsigned integer). You may wonder why you get a 32-bit type when it's a 16-bit CRC. The answer is efficiency.
Download