Informatik 5
Information Systems
Prof. Dr. M. Jarke
RWTH Aachen
Locality-sensitive Hashing using not-so-random Hash Functions

Thesis type: Master
Status: Open

Locality-sensitive hashing (LSH) is used to speed up near-neighbor search in high-dimensional spaces. When the distance of interest is the cosine distance, random hyperplane hashing (RHH) is typically used. This technique selects hyperplanes at random and hashes each point according to the side of each hyperplane on which it lies. However, in some cases (when more information about the dataset is available) it seems reasonable not to choose the hyperplanes completely at random. Furthermore, if standard RHH is performed with a small number of hyperplanes, the hyperplanes are likely to cover the space poorly. This thesis will investigate how to choose the hyperplanes in a data-dependent way and how to sample them so that they cover the space more evenly, including a comparison with angular quantization.
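For orientation, standard RHH can be sketched in a few lines of Python. This is a minimal illustration and not part of the thesis specification: it assumes NumPy, draws the hyperplane normals from a standard Gaussian, and all function names (sample_hyperplanes, rhh_signature, estimated_angle) are hypothetical. It relies on the known property that two vectors at angle theta fall on different sides of a random hyperplane with probability theta / pi.

```python
import numpy as np

def sample_hyperplanes(num_planes, dim, rng):
    """Draw hyperplane normals i.i.d. from a standard Gaussian (standard RHH)."""
    return rng.standard_normal((num_planes, dim))

def rhh_signature(x, hyperplanes):
    """One bit per hyperplane: 1 if x lies on the non-negative side, else 0."""
    return (hyperplanes @ x >= 0).astype(np.uint8)

def estimated_angle(sig_a, sig_b):
    """Estimate the angle (radians) from the fraction of differing bits."""
    return np.pi * np.mean(sig_a != sig_b)

# Illustrative data only: two correlated random vectors.
rng = np.random.default_rng(0)
dim = 50
a = rng.standard_normal(dim)
b = a + 0.3 * rng.standard_normal(dim)
true_angle = np.arccos(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Compare the estimate for a small and a large number of hyperplanes.
for k in (8, 4096):
    planes = sample_hyperplanes(k, dim, rng)
    est = estimated_angle(rhh_signature(a, planes), rhh_signature(b, planes))
    print(f"k={k}: true angle {true_angle:.3f}, estimated {est:.3f}")
```

With only a handful of hyperplanes the estimate is coarse (it can only take multiples of pi / k), which is one way to see the coverage problem described above. A data-dependent scheme would replace sample_hyperplanes; how best to do so is the open question of the thesis.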

