We want to talk about categorizing documents.
Let $X$ be a set of documents. To be concrete, for a fixed $n \in \N$, $X$ may be the set of all length-$n$ character sequences over some alphabet of digital symbols.
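To make the concrete case explicit, here is one possible formalization. It assumes we write $\Sigma$ for the alphabet, a symbol introduced here only for illustration and required to be finite:
\[
  X = \Sigma^n = \{ (x_1, \dots, x_n) : x_i \in \Sigma \text{ for } 1 \le i \le n \},
  \qquad |X| = |\Sigma|^n .
\]
For instance, with the binary alphabet $\Sigma = \{0, 1\}$ and $n = 8$, this $X$ would contain $2^8 = 256$ documents.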