Over the Christmas period, while in between things, I have been
working on a new datatype for the excellent uComponents. I
have named the datatype Similarity.
Similarity is another content picker. The question arises: we
already have plenty of content pickers, so why do we need another?
Well, all the content pickers to date work on the principle of
strong linking, namely the user picks from the content or media
tree an item they would like to link to the current content.
Similarity works on the principle of conceptual linking.
In conceptual linking the content of the current document is
used to find other content that is like the current document.
The datatype works in the following way:
You configure the datatype to point to a Lucene index (it must be
a content index, not a PDF index). Then you set the fields that
you wish to compare. You can select more than one field; however,
only fields of type textstring, text multiple and rich text editor
are presented for selection.

Add the datatype to a document type. When the datatype renders,
you click on "find similar" and a Lucene "more like this" search is
performed. The query returns documents that are similar to the
current one, using the fields specified in the datatype settings.
You can then select and sort the suggested documents.
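
To give a feel for what a "more like this" search does, here is a
minimal sketch in Python. It is not the datatype's code and not
Lucene's implementation; it is a simplified stand-in that assumes
plain term-frequency and inverse-document-frequency weighting. The
function names, the sample pages and the field names are all
hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def more_like_this(current_id, documents, fields, max_terms=10, top_n=3):
    """Return ids of the documents most similar to `current_id`.

    `documents` maps a document id to a dict of field name -> text.
    Only the configured `fields` are compared (hypothetical helper,
    a rough approximation of a "more like this" query).
    """
    # One bag of tokens per document, built from the selected fields only.
    tokens = {
        doc_id: [t for f in fields for t in tokenize(doc.get(f, ""))]
        for doc_id, doc in documents.items()
    }

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for toks in tokens.values():
        doc_freq.update(set(toks))
    n_docs = len(tokens)

    # Weight each term of the current document: frequent here, rare elsewhere.
    term_freq = Counter(tokens[current_id])
    weights = {
        term: count * (math.log(n_docs / doc_freq[term]) + 1.0)
        for term, count in term_freq.items()
    }

    # Keep only the most "interesting" terms as the query.
    query_terms = sorted(weights, key=lambda t: (-weights[t], t))[:max_terms]

    # Score every other document by how often it contains the query terms.
    scores = {}
    for doc_id, toks in tokens.items():
        if doc_id == current_id:
            continue
        counts = Counter(toks)
        score = sum(weights[t] * counts[t] for t in query_terms)
        if score > 0:
            scores[doc_id] = score

    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


# Hypothetical public sector pages with two text fields each.
pages = {
    "bins": {
        "title": "Bin collection days",
        "body": "Find your household waste and recycling collection days.",
    },
    "recycling": {
        "title": "Recycling centres",
        "body": "Recycling centres accept household waste and garden waste.",
    },
    "parking": {
        "title": "Parking permits",
        "body": "Apply for a resident parking permit online.",
    },
}

print(more_like_this("bins", pages, ["title", "body"]))  # ['recycling']
```

The parking page shares no terms with the bins page, so it is not
suggested at all. With only a handful of short pages there is very
little for the term weights to work with.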

The datatype will only really work well on a large, content-rich
site. Good examples of content-rich sites are public sector
websites.
There are a few things that I still need to do with this datatype,
including adding an XSLT extension method so that you can use the
"more like this" query in your XSLT. I am hoping Similarity will
be part of the next release of uComponents.