A term is a single word in a document. Terms were originally conceived as atomic. Just as, in Dalton's atomic theory, atoms remain unchanged during a chemical reaction but change during a nuclear reaction, a term can be considered a basic building block of a document. Every term that is different in its binary […]
A document is composed of a set of terms.
Bag of words is a simple modeling concept in which only the set of words matters. It simplifies the document for modeling purposes by discarding the order of the words. Let's say there is a document with the following content: Taj Mahal Construction of the mausoleum was essentially completed in 1643 but work continued on other […]
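The idea of discarding word order can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `bag_of_words` is hypothetical, the tokenizer is a naive lowercase-and-split, and the bag is stored as a `Counter`, which also keeps counts (wrapping the tokens in `set()` instead would keep only the distinct words).

```python
from collections import Counter

def bag_of_words(text):
    # Naive tokenizer (an assumption for this sketch): lowercase, split on whitespace.
    tokens = text.lower().split()
    # A Counter maps each word to its count and carries no position information.
    return Counter(tokens)

# Two made-up texts with the same words in a different order.
a = bag_of_words("the mausoleum was completed")
b = bag_of_words("completed was the mausoleum")

# Both texts collapse to the same bag, because order is not part of the model.
print(a == b)
```

Because the two bags compare equal, any model built on this representation cannot tell the two texts apart, which is exactly the simplification the concept makes.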
Term frequency, widely known as TF, is the number of times a given term occurs in a document. It is one (older) way of measuring how strongly the term is related to a document.
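As a minimal sketch of this definition, the raw count can be computed as follows. The function name `term_frequency`, the whitespace tokenizer, and the sample sentence are all assumptions made for illustration; real systems usually tokenize more carefully.

```python
from collections import Counter

def term_frequency(term, document):
    # Naive tokenizer (an assumption): lowercase, split on whitespace.
    tokens = document.lower().split()
    # Counter returns 0 for a term that never occurs in the document.
    return Counter(tokens)[term.lower()]

# Made-up sample text, not taken from any real document.
doc = "work continued on the garden and work continued on the dome"

print(term_frequency("work", doc))     # 2
print(term_frequency("palace", doc))   # 0
```

Matching is case-insensitive here only because both the document and the query term are lowercased; whether that is desirable depends on how terms are defined for the collection.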