Inverted index in Scala
07 Dec 2015
Last week I had to build an inverted index to speed up, by a lot, a program doing parallel document identification. I was quite impressed by how simple it was in Scala!
Inverted index?
It’s simple. Let’s say you have this index:
For each line you have a document identifier followed by the words appearing in the document. Here, word1 is in document0.txt, but not in document1.txt.
And you want to turn this index into:
For each word you have the documents in which it appears. Here, word0 appears in document0.txt and document1.txt.
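To make the transformation concrete, here is a minimal sketch of one way to do the inversion in Scala. It is not necessarily the code from this post: the `InvertedIndexSketch` object, the `forwardIndex` sample data, and the `invert` function are all hypothetical names, and `word2` is made up to fill out the example. It assumes the forward index has already been parsed into a `Map` from document identifier to words.

```scala
object InvertedIndexSketch {
  // Hypothetical forward index: document identifier -> words appearing in it.
  // word2 is invented for illustration only.
  val forwardIndex: Map[String, Seq[String]] = Map(
    "document0.txt" -> Seq("word0", "word1"),
    "document1.txt" -> Seq("word0", "word2")
  )

  // Invert the index: flatten it into (word, document) pairs,
  // group the pairs by word, then keep only the document identifiers.
  def invert(index: Map[String, Seq[String]]): Map[String, Set[String]] =
    index.toSeq
      .flatMap { case (doc, words) => words.map(word => word -> doc) }
      .groupBy { case (word, _) => word }
      .map { case (word, pairs) => word -> pairs.map(_._2).toSet }

  def main(args: Array[String]): Unit =
    // Print one line per word, followed by the documents containing it.
    invert(forwardIndex).toSeq.sortBy(_._1).foreach { case (word, docs) =>
      println(s"$word ${docs.toSeq.sorted.mkString(" ")}")
    }
}
```

With this sample data, `invert` maps `word0` to both documents and `word1` to `document0.txt` only. Using a `Set` for the documents is a design choice that drops duplicates when a word occurs several times in the same document.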
This is...