Replicating Partitioned Learned Bloom Filter

A coursework project that replicates Vaidya et al.'s Partitioned Learned Bloom Filter.

  • Python
  • NumPy
  • pandas
  • scikit-learn

A coursework project for CS 4964 (Managing Data for ML) that replicates Vaidya et al.'s Partitioned Learned Bloom Filter (PLBF). A PLBF reduces the space usage of a traditional Bloom filter by using a machine learning classifier. On the EMBER dataset, we decreased space usage by 13.4% with performance comparable to a traditional Bloom filter.
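
To make the idea concrete, here is a minimal sketch of how a partitioned learned Bloom filter can be put together. It is an illustration only, not the project's actual code: the class names, `toy_score`, the single threshold, and the per-region false-positive rates are all hypothetical. In the paper the thresholds and per-region rates are chosen by an optimization step; here they are fixed by hand, and a hard-coded scoring function stands in for a trained classifier.

```python
import bisect
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter sized for n_items at a target false-positive rate."""

    def __init__(self, n_items, fpr):
        n_items = max(1, n_items)
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.m = max(1, math.ceil(-n_items * math.log(fpr) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive k bit positions from one SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class PartitionedLearnedBloomFilter:
    """Hypothetical sketch: one backup Bloom filter per classifier-score region."""

    def __init__(self, keys, score_fn, thresholds, fprs):
        # thresholds: sorted score cut points; fprs: one target rate per region.
        # A rate of 1.0 means the region has no backup filter and accepts everything.
        assert len(fprs) == len(thresholds) + 1
        self.score_fn = score_fn
        self.thresholds = thresholds
        regions = [[] for _ in fprs]
        for key in keys:
            regions[self._region(key)].append(key)
        self.filters = []
        for members, fpr in zip(regions, fprs):
            if fpr >= 1.0:
                self.filters.append(None)
            else:
                bf = BloomFilter(len(members), fpr)
                for key in members:
                    bf.add(key)
                self.filters.append(bf)

    def _region(self, key):
        # Index of the score region that the key falls into.
        return bisect.bisect_right(self.thresholds, self.score_fn(key))

    def __contains__(self, key):
        bf = self.filters[self._region(key)]
        return True if bf is None else key in bf


def toy_score(key):
    # Stand-in for a trained classifier's probability that the key is in the set.
    return 0.9 if key.startswith("mal") else 0.2


# 100 keys the toy classifier scores high, plus 10 keys it scores low.
keys = [f"mal_{i}" for i in range(100)] + [f"odd_{i}" for i in range(10)]
plbf = PartitionedLearnedBloomFilter(keys, toy_score, thresholds=[0.5], fprs=[0.01, 1.0])

# Every inserted key is found: low-scoring keys are caught by the backup filter.
print(all(k in plbf for k in keys))
```

The space saving in this sketch comes from the high-score region: its 100 keys need no bits at all, so only the 10 low-scoring keys are stored in a backup filter. The price is that any non-key the classifier scores above the threshold is accepted as a false positive, which is why the choice of thresholds and per-region rates matters.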