Inverted index in Scala
07 Dec 2015
Last week I had to build an inverted index to speed up (a lot) a program doing parallel document identification. And I was quite impressed by how simple it was in Scala!
Inverted index?
It’s simple. Let’s say you have this index:
For each line you have a document identifier followed by the words appearing in the document. Here, word1 is in document0.txt, but not in document1.txt.
And you want to turn this index into:
For each word you have the documents in which it appears. Here, word0 appears in document0.txt and document1.txt.
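To make the transformation concrete, here is a minimal sketch in Scala of one way to invert such an index. It is not necessarily the code this post goes on to show. The sample data, the object name `InvertedIndexSketch`, and the function name `invert` are assumptions made up for illustration, chosen only to match the description above (word1 is in document0.txt only, word0 is in both documents).

```scala
object InvertedIndexSketch {
  // Hypothetical forward index: document identifier -> words appearing in it.
  // The data is invented for illustration, not taken from the original post.
  val forwardIndex: Map[String, Seq[String]] = Map(
    "document0.txt" -> Seq("word0", "word1"),
    "document1.txt" -> Seq("word0", "word2")
  )

  // Turn (document -> words) into (word -> documents).
  def invert(index: Map[String, Seq[String]]): Map[String, Set[String]] =
    index.toSeq
      // Emit one (word, document) pair per word occurrence.
      .flatMap { case (doc, words) => words.map(word => word -> doc) }
      // Gather all pairs sharing the same word.
      .groupBy { case (word, _) => word }
      // Keep only the document identifiers, de-duplicated.
      .map { case (word, pairs) => word -> pairs.map(_._2).toSet }

  val inverted: Map[String, Set[String]] = invert(forwardIndex)
}
```

With this sample data, looking up `InvertedIndexSketch.inverted("word0")` gives the set containing both document0.txt and document1.txt, while word1 maps to document0.txt alone. A `Set` is used for the document lists so that a word repeated within one document is only recorded once.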
This is...