Removing Common Files in E-Discovery Processing: De-NISTing Explained

Reducing the number of documents to review during an e-discovery project is a high priority for both attorneys and their clients. One commonly used technique is to remove files from a document set that are known to belong to certain software programs. This process is called “Known File Filtering” and is often referred to as “de-NISTing” since it uses a list of file hashes created by the National Institute of Standards and Technology (NIST).

The “NIST” list is actually a database called the National Software Reference Library (NSRL). This database contains information about software, including “hash” values which, for practical purposes, uniquely identify the data within a file, regardless of its name, date of creation or location. If two files contain identical data, they will also have identical hash values.
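The point that a hash depends only on a file's contents can be illustrated with a short Python sketch. It assumes SHA-1 (one of the hash types the NSRL publishes) and uses made-up file names and contents; the helper names `file_digest` and `demo` are illustrative and are not part of any NSRL tool.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha1") -> str:
    """Return the hex digest of a file's contents, read in chunks."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def demo():
    """Hash three hypothetical files: two identical copies and one edit."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "report.doc"
        renamed = Path(tmp) / "renamed_copy.bin"  # same bytes, new name
        edited = Path(tmp) / "edited.doc"         # one extra byte
        original.write_bytes(b"same bytes")
        renamed.write_bytes(b"same bytes")
        edited.write_bytes(b"same bytes!")
        return file_digest(original), file_digest(renamed), file_digest(edited)


d_original, d_renamed, d_edited = demo()
print(d_original == d_renamed)  # name and location do not affect the hash
print(d_original == d_edited)   # any change to the data changes the hash
```

The renamed copy produces the same digest as the original, while the edited file does not, which is what allows a known file to be recognized wherever it appears in a document set.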

The NSRL database is a collection of categorized file information for software of all kinds. It organizes programs into groups, such as word processing software, system files, gaming programs, etc. Unlike several earlier collections of common computer file information (like HashKeeper), the NSRL does not make a distinction between “good” and “bad” files and does not contain lists of contraband data, such as child pornography.

One of the key features of the NSRL is that anyone can submit software for review and inclusion in the list, which has helped keep the list up to date. In fact, many e-discovery and digital forensics software companies have included the NSRL in their products to assist with culling out irrelevant data in the early stages of an investigation.
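As a rough illustration of how such culling might work, the following Python sketch compares each file's hash against a set of known hashes and separates the matches from the files that still need review. It is a minimal sketch and not how any particular product implements the step: the function name `de_nist`, the in-memory file contents and the tiny “known” set are all hypothetical stand-ins for a real document set and the real NSRL list.

```python
import hashlib
from typing import Dict, List, Set, Tuple


def sha1_hex(data: bytes) -> str:
    """SHA-1 of the data as an uppercase hex string."""
    return hashlib.sha1(data).hexdigest().upper()


def de_nist(files: Dict[str, bytes],
            known_hashes: Set[str]) -> Tuple[List[str], List[str]]:
    """Split file names into (kept for review, culled as known)."""
    kept: List[str] = []
    culled: List[str] = []
    for name, data in files.items():
        if sha1_hex(data) in known_hashes:
            culled.append(name)
        else:
            kept.append(name)
    return kept, culled


# Hypothetical document set: two program files and one user document.
sample_files = {
    "winword.exe": b"word processor binary",
    "system.dll": b"operating system library",
    "contract_draft.doc": b"user-authored content",
}

# Hypothetical known-file list built from the two program files.
known = {sha1_hex(b"word processor binary"),
         sha1_hex(b"operating system library")}

kept, culled = de_nist(sample_files, known)
print("Review:", kept)
print("Culled:", culled)
```

Only the user-authored document remains for review here; the two program files are culled because their hashes appear in the known set.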

The NSRL currently contains approximately 53,000,000 file entries, and a new list is released every month to address software updates and newly available programs. Using the complete list in its most current version is a very important step in properly reducing the amount of data that must be reviewed, since some e-discovery software does not update its copy of the list automatically or does not contain the complete list.

De-NISTing is a very helpful part of processing but is not a “silver bullet” for reducing e-discovery document sets. There are certain files or programs in the list that may actually be relevant depending on the scope of a particular case. For example, remote access software programs have legitimate IT functions but can also be utilized for nefarious purposes. In situations where misuse of this type of program is suspected, it is critical to communicate this information to an e-discovery or digital forensics vendor to ensure that key information is not inadvertently excluded. Likewise, commercially available data wiping software (such as Evidence Eliminator or Disk Redactor) is certainly present in the NSRL but may be a pivotal part of an investigation involving data deletion.

Since the NSRL is simply a list of categorized programs that are known, and makes no distinction between those that are “good” and “bad”, it should not be used without careful thought. Consideration should be given to the particulars of each case and whether key information could reside in files belonging to programs listed in the NSRL.
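One way to apply that kind of case-specific judgment is to exempt selected software categories from culling. The Python sketch below assumes the known-file list is available as a mapping from hash to category; the category labels, the placeholder hash strings and the function name `cull_known_files` are hypothetical and do not reflect the NSRL's actual naming or any vendor's workflow.

```python
from typing import Dict, List, Set, Tuple


def cull_known_files(
    file_hashes: Dict[str, str],
    known_categories: Dict[str, str],
    categories_to_keep: Set[str],
) -> Tuple[List[str], List[str]]:
    """Cull known files, except those in categories relevant to the case."""
    kept: List[str] = []
    culled: List[str] = []
    for name, digest in file_hashes.items():
        category = known_categories.get(digest)
        # Unknown files, and known files in a category of interest,
        # stay in the review set.
        if category is None or category in categories_to_keep:
            kept.append(name)
        else:
            culled.append(name)
    return kept, culled


# Hypothetical known-file list: placeholder hash -> category label.
known_categories = {
    "HASH-A": "word processing",
    "HASH-B": "remote access",
    "HASH-C": "data wiping",
}

# Hypothetical document set: file name -> placeholder hash.
file_hashes = {
    "winword.exe": "HASH-A",
    "remote_tool.exe": "HASH-B",
    "wiper.exe": "HASH-C",
    "memo.doc": "HASH-Z",  # not in the known list
}

# A case involving suspected data deletion and remote access misuse.
kept, culled = cull_known_files(
    file_hashes, known_categories, {"remote access", "data wiping"}
)
print("Review:", kept)
print("Culled:", culled)
```

With both categories exempted, only the word processor is culled; the remote access and data wiping programs stay in the review set along with the unknown document.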

Article is a public service announcement from Avansic.

Avansic is a full service processing vendor – For more information, call 888-808-0337 or visit

Updated 06-02-2010
