A search engine using Hadoop
The aim of the Person Search System project was to crawl the web and find individuals, much as existing person-finding sites do. Unlike those sites, we also wanted to distinguish between different people who share the same name.
The system was built upon Hadoop and HBase, exploiting the mechanisms behind HBase to generate a reverse index, a document store and a fair URL queueing system.
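To illustrate the reverse index idea, here is a minimal sketch in plain Python. It is not the project's code and uses neither Hadoop nor HBase: the function names (`map_document`, `reduce_postings`, `build_reverse_index`, `search`) and the sample documents are hypothetical, and a sorted dictionary stands in for an HBase table, which keeps its rows ordered by row key. It assumes a map step that emits (term, document id) pairs and a reduce step that groups them by term.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Map step: emit one (term, doc_id) pair per distinct word in a document."""
    for term in sorted(set(text.lower().split())):
        yield term, doc_id


def reduce_postings(pairs):
    """Reduce step: collect the ids of all documents that contain each term."""
    postings = defaultdict(set)
    for term, doc_id in pairs:
        postings[term].add(doc_id)
    # Sort by term to mimic a table whose rows are ordered by row key
    # (assumption: one row per term, holding the matching document ids).
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


def build_reverse_index(documents):
    """Run the map and reduce steps over a {doc_id: text} dictionary."""
    pairs = []
    for doc_id, text in documents.items():
        pairs.extend(map_document(doc_id, text))
    return reduce_postings(pairs)


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample data: two different people who share a name.
documents = {
    "doc1": "John Smith engineer London",
    "doc2": "John Smith painter Paris",
    "doc3": "Jane Doe engineer",
}
index = build_reverse_index(documents)
```

With this sample data, `search(index, "john smith")` matches both `doc1` and `doc2`, while adding a distinguishing term such as `engineer` narrows the result to `doc1`. This is only a hint at why telling apart people with the same name needs more than the name itself.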
For more information
A presentation was given at the end of the project; the slides can be found here under the "Side Project" section.