extractText(obj, options)

Extracts text from a given Obj and returns it as a string.

extractText returns the plain-text version of those attributes of an Obj that are listed as extractTextAttributes in its model class. To do this, the function removes HTML tags and newlines from the attribute values. Here are some use cases:

  • Displaying a preview snippet, for example the first 300 characters of a
    • page in a search results list,
    • blog post in a blog post overview,
    • PDF file, as a text preview (e.g. in a search results list).
  • Providing metadata for a page, for example by
    • using extracted text in og:description or twitter:description meta tags in the HTML head,
    • using widgets as a content source in a Schema.org JobPosting.
  • Calculating the estimated reading time of a blog post based on the word count.
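
The reading-time use case could be sketched roughly as follows. The helper name estimateReadingTimeMinutes and the default of 200 words per minute are assumptions made for illustration and are not part of the Scrivito SDK. The input is any plain-text string, such as the value returned by extractText.

```javascript
// Hypothetical helper, not part of the Scrivito SDK.
// Estimates the reading time in whole minutes for a plain-text string.
// The default of 200 words per minute is an assumed average reading speed.
function estimateReadingTimeMinutes(text, wordsPerMinute = 200) {
  // Split on any run of whitespace and drop empty entries.
  const words = text.trim().split(/\s+/).filter(Boolean);
  if (words.length === 0) {
    return 0;
  }
  // Round up so that a short text still counts as one minute.
  return Math.ceil(words.length / wordsPerMinute);
}
```

In an application, the text argument would typically be the result of calling extractText on a blog post Obj.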


  • obj (Obj) – The Obj instance from which text should be extracted.
  • options (Object):
    • length (Number) – The maximum length of the return value. Limiting the length to a reasonable value (e.g. 300 characters) may speed up the text extraction process. Default: 1,000,000,000


String – the values of the Obj’s extractTextAttributes as a single string, stripped of HTML tags and newlines.
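
To make "stripped of HTML tags and newlines" more concrete, here is a rough illustration of that kind of transformation. It is not the SDK's implementation, and the function name toPlainText is hypothetical. The real extractText may treat whitespace, HTML entities and truncation differently.

```javascript
// Illustration only: a naive approximation of turning an HTML attribute
// value into plain text. This is NOT how the Scrivito SDK is implemented.
function toPlainText(html, length = Infinity) {
  const text = html
    .replace(/<[^>]*>/g, " ") // replace each tag with a space
    .replace(/\s+/g, " ")     // collapse newlines and repeated whitespace
    .trim();
  // Limit the result to at most `length` characters.
  return text.slice(0, length);
}
```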


This method is loadable, meaning that it is able to return partial results and indicate to Scrivito.load or Scrivito.connect that it needs to be executed again at a later point in time.
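
Outside of a connected component, a loadable call is typically wrapped in Scrivito.load, which returns a promise that resolves once the data is fully available. The sketch below assumes that the Scrivito module is passed in as a parameter, which keeps the snippet self-contained. The function name loadPreviewText and the default length of 300 are illustrative choices.

```javascript
// Hypothetical helper. `scrivito` is expected to be the Scrivito SDK module
// (for example the result of `import * as Scrivito from "scrivito"`).
// Returns a promise for the extracted text once all required data is loaded.
function loadPreviewText(scrivito, obj, length = 300) {
  return scrivito.load(() => scrivito.extractText(obj, { length }));
}
```

Inside a component wrapped with Scrivito.connect, extractText can be called directly during rendering, because the component is rendered again once the missing data has been loaded.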

Attributes such as title should not be included in the extractTextAttributes list because search result lists or blog post overviews most likely display the individual titles anyway.


Prepare text extraction from instances of a simple Page class:

Extract text from a simple HeadlineWidget contained in a widgetlist attribute:

Extract text from several attributes of a widget:

Make use of the length option:

Extract text from several widgets:

Extract text from an attribute of the html type:

Extract text from a PDF file and limit its length to the first 100 characters: