Filters
The Bloom filter, conceived by Burton H. Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positives are possible, but false negatives are not.
Elements can be added to the set, but not removed (though this can be addressed with a counting filter). The more elements that are added to the set, the larger the probability of false positives.
For example, one might use a Bloom filter to do spell-checking in a space-efficient way. A Bloom filter to which a dictionary of correct words has been added will accept every word in the dictionary and reject almost all words that are not in it, which is good enough in some cases. Depending on the false positive rate, the resulting data structure can require as little as a byte per dictionary word.
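To give a feel for the "byte per dictionary word" figure, the sketch below evaluates the standard approximation for the false positive rate, p ≈ (1 - e^(-kn/m))^k, where m is the number of bits, n the number of stored words and k the number of hash functions. The formula is not stated in this article and is only an approximation; the function name `false_positive_rate` is hypothetical.

```python
import math

def false_positive_rate(bits_per_item: float, k: int) -> float:
    """Approximate false positive rate for m/n = bits_per_item and k hashes.

    Uses the common approximation (1 - e^(-k*n/m))^k, which assumes
    independent, uniformly distributed hash functions.
    """
    return (1.0 - math.exp(-k / bits_per_item)) ** k

# One byte (8 bits) per dictionary word, for several choices of k.
for k in range(1, 9):
    print(k, round(false_positive_rate(8, k), 4))
```

Under these assumptions, one byte per word gives an estimated false positive rate of roughly 2% when k is 5 or 6, and a noticeably higher rate for very small k.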
One peculiar attribute of this spell-checker is that it is not possible to extract the list of correct words from it – at best, one can extract a list containing the correct words plus a significant number of false positives. This limitation can be considered a feature when you want to check for a set of items without disclosing those items. Examples include a security application that scans your disk for Social Security numbers, or a program that scrubs opted-out email addresses from the lists of mass mailers, where you do not want to reveal any of the opted-out addresses to the companies using your list. This is not a completely secure solution, however, as it may be possible to separate the false positives from the real data by some other means.
Algorithm description
An empty Bloom filter is a bit array of m bits, all set to 0. There must also be k different hash functions defined, each of which maps a key value to one of the m array positions.
For a good hash function with a wide output, there should be little if any correlation between different bit-fields of such a hash, so this type of hash can be used to generate multiple "different" hash functions by slicing its output into multiple bit fields. Alternatively, one can pass k different initial values (such as 0, 1, ..., k-1) to a hash function that takes an initial value; or add (or append) these values to the key.
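Both approaches can be sketched briefly. The example below is only an illustration under stated assumptions: it uses SHA-256 from Python's standard `hashlib` as the "good hash function with a wide output", takes 32-bit slices, and reduces each value modulo m. The function names `positions_by_slicing` and `positions_by_seeding` are hypothetical.

```python
import hashlib

def positions_by_slicing(key: bytes, k: int, m: int) -> list[int]:
    """Slice one wide digest into k 4-byte fields, each reduced modulo m."""
    digest = hashlib.sha256(key).digest()  # 32 bytes, so at most 8 slices
    if k * 4 > len(digest):
        raise ValueError("digest too short for k 32-bit slices")
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % m
            for i in range(k)]

def positions_by_seeding(key: bytes, k: int, m: int) -> list[int]:
    """Append the values 0..k-1 to the key and hash each variant."""
    positions = []
    for seed in range(k):
        digest = hashlib.sha256(key + seed.to_bytes(4, "big")).digest()
        positions.append(int.from_bytes(digest[:8], "big") % m)
    return positions

print(positions_by_slicing(b"example", k=4, m=1024))
print(positions_by_seeding(b"example", k=4, m=1024))
```

Slicing needs only one hash computation per key but caps k at the number of fields the digest can supply; appending a counter has no such cap at the cost of k hash computations.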
To add an element, feed it to each of the k hash functions to get k array positions. Set the bits at all these positions to 1.
To query for an element (test whether it is in the set), feed it to each of the k hash functions to get k array positions. If any of the bits at these positions are 0, the element is not in the set – if it were, then all the bits would have been set to 1 when it was inserted. If all are 1, then either the element is in the set, or the bits have been set to 1 during the insertion of other elements.
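The add and query steps described above can be put together in a minimal sketch. It assumes the k hash functions are obtained by appending 0, 1, ..., k-1 to the key and hashing with SHA-256; the class name `BloomFilter` and the parameter values are illustrative, not a reference implementation.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a bit array of m bits and k derived hashes."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # all bits start at 0

    def _positions(self, item: str):
        # Assumption: derive k hash functions by appending 0..k-1 to the key.
        data = item.encode("utf-8")
        for i in range(self.k):
            digest = hashlib.sha256(data + i.to_bytes(4, "big")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        # Set the bit at each of the k positions to 1.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # Any 0 bit proves absence; all 1 bits mean "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter(m=1024, k=4)
for word in ("apple", "banana", "cherry"):
    bf.add(word)

print("apple" in bf)   # True: an added element is never reported absent
print("durian" in bf)  # usually False, but a false positive is possible
```

Note that there is no remove method: clearing a bit could also erase evidence of other elements that share it.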
Read more at Wikipedia.org