Basic text analysis
This page contains descriptions of the basic text analysis and search functions.
Most of the functions described below produce different results depending on the mode. When ontology mode is active, common words with little content, such as the, is, and on, are not included in the results. Something similar applies when searching for compounds: in ontology mode, prefixing cheese will find goat cheese, but it will not show the cheese.
In language mode, all words and/or terms are included in the results.
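The difference between the two modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list `STOP_WORDS` and the function `word_frequencies` are hypothetical names, and the application's actual list of low-content words is not documented on this page.

```python
from collections import Counter

# Hypothetical list of low-content words; the real list used by the
# application is an assumption here.
STOP_WORDS = {"the", "is", "on"}

def word_frequencies(tokens, ontology_mode):
    """Count words; in ontology mode, low-content words are skipped."""
    words = [t.lower() for t in tokens]
    if ontology_mode:
        words = [w for w in words if w not in STOP_WORDS]
    return Counter(words)

tokens = "the goat cheese is on the cheese platter".split()
ontology = word_frequencies(tokens, ontology_mode=True)
language = word_frequencies(tokens, ontology_mode=False)
print(sorted(ontology))   # no "the", "is" or "on"
print(sorted(language))   # every word is included
```

In this sketch the mode only changes which words are counted; the counting itself is identical in both modes.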
The menu provides access to options that summarize the frequencies of words and of subsets of words in the corpus.
The screendump illustrates the user interface after the user has selected one of these options. This resulted in all lemmas being displayed in the second browser. Thereafter, the user clicked on one of the lemmas (egg). The third browser then displays the inflections (e.g. the plural eggs) and the first browser the documents that contain the lemma egg.
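The relation between lemmas, inflections and documents described above can be sketched as a small index. This is a minimal illustration, not the application's implementation: the lookup table `LEMMAS`, the function `build_lemma_index` and the document names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical word-to-lemma table; a real system would use a full lemmatizer.
LEMMAS = {"eggs": "egg", "egg": "egg", "cakes": "cake", "cake": "cake"}

def build_lemma_index(documents):
    """Map each lemma to its inflections and to the documents containing it."""
    inflections = defaultdict(set)
    containing = defaultdict(set)
    for name, text in documents.items():
        for word in text.lower().split():
            lemma = LEMMAS.get(word, word)  # unknown words are their own lemma
            inflections[lemma].add(word)
            containing[lemma].add(name)
    return inflections, containing

documents = {
    "recipe1": "beat one egg",
    "recipe2": "two eggs and a cake",
    "recipe3": "chocolate cakes",
}
inflections, containing = build_lemma_index(documents)
print(sorted(inflections["egg"]))  # ['egg', 'eggs']
print(sorted(containing["egg"]))   # ['recipe1', 'recipe2']
```

Selecting the lemma egg then corresponds to two lookups: one for its inflections and one for the documents that contain it.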
Prefixing, infixing and postfixing (also called suffixing) are simple and powerful methods to find compound terms. In English, the general rule is that the last word of a compound determines what it is. For example, chocolate cake is a cake (and not a kind of chocolate).
Enter a term into the text entry field and then click on the prefix, infix or postfix icon. The results are shown in the third browser. Prefix looks before the term: prefixing cheese could result in compounds like goat cheese or fresh cheese. Postfix looks after the term: postfixing cheese results in cheese platter. Infix looks both before and after the term; it is rarely used.
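A minimal Python sketch of the three "fix" searches follows. It assumes that "looking before" and "looking after" mean taking the word directly adjacent to the term in the text; the function name `fix_search` and the `kind` parameter are hypothetical and do not correspond to anything in the application itself.

```python
def fix_search(tokens, term, kind):
    """Find compounds around `term` in a list of words.

    kind is "prefix" (word before + term), "postfix" (term + word after)
    or "infix" (word before + term + word after).
    """
    results = []
    for i, token in enumerate(tokens):
        if token != term:
            continue
        has_prev = i > 0
        has_next = i + 1 < len(tokens)
        if kind == "prefix" and has_prev:
            results.append(f"{tokens[i - 1]} {token}")
        elif kind == "postfix" and has_next:
            results.append(f"{token} {tokens[i + 1]}")
        elif kind == "infix" and has_prev and has_next:
            results.append(f"{tokens[i - 1]} {token} {tokens[i + 1]}")
    return results

tokens = ["goat", "cheese", "platter", "fresh", "cheese"]
print(fix_search(tokens, "cheese", "prefix"))   # ['goat cheese', 'fresh cheese']
print(fix_search(tokens, "cheese", "postfix"))  # ['cheese platter']
print(fix_search(tokens, "cheese", "infix"))    # ['goat cheese platter']
```

The sketch ignores sentence boundaries and the ontology/language mode distinction, both of which would affect the results in practice.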
When the third browser is showing a list of "fix" terms, dragging and dropping a term from any browser applies the "fix" to the dropped term.
The drop-down menu provides a Sub-string function which searches for all words that contain the sub-string provided in the text entry field. For example, with the sub-string xt, possible results are next and extra. The results are displayed in the second browser.
The special character ^ matches the beginning of a word and $ matches the end of a word. Thus, xt$ matches all words ending in xt. English is particularly regular with regard to endings: nearly any verb can be turned into a noun by suffixing the verb with er (work, worker; read, reader; etc.).
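The sub-string search with the ^ and $ markers can be sketched as follows. This is an illustration only: it assumes that ^ is meaningful only as the first character and $ only as the last, and that the rest of the query is matched literally. The function name `substring_search` is hypothetical.

```python
def substring_search(words, query):
    """Return the words matching `query`, honouring a leading ^ and trailing $."""
    at_start = query.startswith("^")
    core = query[1:] if at_start else query
    at_end = core.endswith("$")
    core = core[:-1] if at_end else core

    def matches(word):
        if at_start and at_end:
            return word == core          # whole word must equal the sub-string
        if at_start:
            return word.startswith(core)
        if at_end:
            return word.endswith(core)
        return core in word              # plain sub-string search

    return [w for w in words if matches(w)]

words = ["next", "extra", "text", "worker", "reader"]
print(substring_search(words, "xt"))   # ['next', 'extra', 'text']
print(substring_search(words, "xt$"))  # ['next', 'text']
print(substring_search(words, "^ex"))  # ['extra']
print(substring_search(words, "er$"))  # ['worker', 'reader']
```

The last query shows how an ending such as er can be used to collect nouns derived from verbs.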
All five browsers contain a popup menu under the right mouse button. The options in these popups are described below.