TagCloudPlugin

28 October 2020 - 16:50 | Version 1 |
Renders a tag cloud given a list of terms

Description

This plugin helps render tag clouds. From Wikipedia:Tag_cloud:

A Tag Cloud is a text-based depiction of tags across a body of content to show frequency of tag usage and enable topic browsing. In general, the more commonly used tags are displayed with a larger font or stronger emphasis. Each term in the tag cloud is a link to the collection of items that have that tag.

Tag clouds give a very quick overview of the distribution of terms within a document base. This can be used to support navigation or just to display the characteristics of the given data. Tag clouds are quite common in blog archives, where you can click on a tag in the cloud to list all postings that are tagged that way. Tag clouds can be generated from any list of terms; the terms don't need to be "tags" in the stricter sense, though the name of the figure is still "tag cloud".

How tag clouds are computed

Computing the tag cloud is done by counting the occurrences of the terms in the input data.

First, the input data has to be tokenized and normalized so that each word in the input is matched to a term. This is done by
  1. splitting up the data and removing special characters,
  2. mapping different terms onto one term, e.g. to handle synonyms such as wikiapps=WikiApplication (optional),
  3. filtering unwanted words; there is a predefined list of (English) stop words that can be applied,
  4. mapping plural to singular word forms (done rudimentarily, for English only).
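
The four steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's actual code: the stop-word list, the synonym map, the lowercasing of all input and the `singularize` helper are assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list and synonym map, for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "are"}
SYNONYMS = {"wikiapps": "wikiapplication"}


def singularize(word):
    """Very rough English plural handling: strip a single trailing 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def count_terms(text, stop_words=STOP_WORDS, synonyms=SYNONYMS):
    """Tokenize and normalize text, then count how often each term occurs."""
    # 1. split the data and drop special characters (input is lowercased here)
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for token in tokens:
        # 2. map different terms onto one term (synonyms)
        term = synonyms.get(token, token)
        # 3. filter unwanted words
        if term in stop_words:
            continue
        # 4. map plural to singular word forms
        term = singularize(term)
        counts[term] += 1
    return counts
```

Calling `count_terms("The tags and the tag of WikiApps")` counts `tag` twice and `wikiapplication` once, while the stop words are dropped.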

Once all terms have been counted, they are mapped into a fixed set of buckets. For example, if 102 different terms have been counted and ordered by frequency, they are distributed over a set of, say, 30 buckets. Each term is assigned a weight, which is the ID of the bucket it has been sorted into.
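
As a rough sketch of the bucket mapping, the following Python function scales term counts linearly onto bucket IDs. The linear scheme and the name `assign_weights` are assumptions for illustration; the plugin's own normalization may differ (the change history mentions a logarithmic variant, for instance).

```python
def assign_weights(counts, buckets=30):
    """Map each term's count onto a bucket ID in 1..buckets (linear scaling)."""
    if not counts:
        return {}
    lowest = min(counts.values())
    highest = max(counts.values())
    span = highest - lowest
    weights = {}
    for term, count in counts.items():
        if span == 0:
            # all terms are equally frequent: put them in the first bucket
            weights[term] = 1
        else:
            # the rarest term lands in bucket 1, the most frequent in `buckets`
            weights[term] = 1 + (count - lowest) * (buckets - 1) // span
    return weights
```

With `assign_weights({"a": 1, "b": 5, "c": 3}, buckets=5)`, the rarest term `a` gets weight 1, `c` gets weight 3 and the most frequent term `b` gets weight 5.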

In general, the tag cloud renders more frequent terms bolder and/or more colorful than less frequent ones. See the example below. You are free, however, to configure any variation in the appearance of a tag depending on its frequency.

Syntax Rules

Syntax:
%TAGCLOUD{(terms=)"<term-list>" ... }%

This renders a tag cloud from a list of terms, from which the term frequencies are extracted. A couple of options influence the way the list of terms is tokenized and processed, as well as the appearance of the resulting tag cloud.

Parameters:

Format strings (terms, header, format, sep and footer) might contain the following pseudo variables:

Example

Cloud for the above text:

You type:
%TAGCLOUD{"$percntINCLUDE{\"%WEB%.%TOPIC%\" section=\"exampletext\"}$percnt"
  header="<div style=\"text-align:center; padding:15px;line-height:180%\">"
  format="<span style=\"font-size:$weightpx;line-height:90%\"><a style=\"color:$fadeRGB(104,144,184,0,102,255);text-decoration:none\" title=\"$count\">$term</a></span>"
  footer="</div>"
  buckets="40"
  offset="12"
  lowercase="on"
  stopwords="on"
  plural="off"
  min="2"
  map="bucket=pail"
  filter="on"
}%

You get (faked): TagCloud

You get (if installed):
alpha alphabetically appearance character cloud color common count counted counting current custom data default different display displayed don't english example exclude expand expression field filter filtering footer form format frequency frequent general given group header included input integer list lowercase mapped mapping normalize normalized number occurrence offset output pail plural predefined pseudo regular render sep set singular sort sorted special split stop stopword string switch synonym syntax tag term tokenized undefined used value variable way weight word
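
The format string in the example colors each term with `$fadeRGB(104,144,184,0,102,255)`. Assuming this fades linearly from the first RGB triple to the second as the weight grows, the computation might look like the Python sketch below. The function name `fade_rgb`, the linear interpolation and the weight range 1..buckets (ignoring any offset) are all assumptions, not taken from the plugin.

```python
def fade_rgb(weight, buckets, start, end):
    """Interpolate linearly between two RGB triples according to a weight."""
    # weight is assumed to lie in 1..buckets; 1 gives `start`, buckets gives `end`
    t = 0.0 if buckets <= 1 else (weight - 1) / (buckets - 1)
    r, g, b = (round(s + (e - s) * t) for s, e in zip(start, end))
    return f"rgb({r},{g},{b})"
```

Under these assumptions, the lowest weight yields the start color `rgb(104,144,184)` and the highest weight yields the end color `rgb(0,102,255)`.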

Installation Instructions

You do not need to install anything in the browser to use this extension. The following instructions are for the administrator who installs the extension on the server.

Open configure, and open the "Extensions" section. "Extensions Operation and Maintenance" Tab → "Install, Update or Remove extensions" Tab. Click the "Search for Extensions" button. Enter part of the extension name or description and press search. Select the desired extension(s) and click install. If an extension is already installed, it will not show up in the search results.

You can also install from the shell by running the extension installer as the web server user (not as root!):
cd /path/to/foswiki
perl tools/extension_installer <NameOfExtension> install

If you have any problems, or if the extension isn't available in configure, then you can still install manually from the command-line. See https://foswiki.org/Support/ManuallyInstallingExtensions for more help.

Change History

28 Oct 2020: minor code clean up
01 Apr 2016: reverted old unicode fix no longer appropriate since Foswiki-2
15 May 2014: fixed filtering terms
01 Nov 2013: preserve sorting order of terms; fixed normalization to respect the min parameter; min defaults to 1 now; fixed unicode support
10 Dec 2009: added logarithmic normalization (Foswiki:Main/DanielOderbolz)
25 Nov 2009: fixed grouping in case sensitive tag clouds
03 Sep 2009: made sorting case insensitive; added case sensitive extra param
25 Aug 2009: added custom fields for terms in the tagcloud (Foswiki:Main.OliverKrueger)
24 Apr 2009: converted to foswiki plugin
07 Jan 2009: fixed parsing of parameters (tststs); certified for foswiki/compat
03 Jan 2008: added limit parameter; added sorting according to term frequency (count)
13 Sep 2007: don't remove numericals from terms
05 Jun 2007: better default values, e.g. filter is off by default now; fixed expansion of standard escapes
31 Aug 2006: added filter parameter to customize special chars to be excluded; added NO_PREFS_IN_TOPIC
10 Mar 2006: added grouping
07 Mar 2006: added escape chars to the term list parameter
03 Mar 2006: added warn parameter; fixed use of uninitialised value; if tags are sorted by weight tags of the same weight get sorted alphabetically now; sorting by weight is descending by default now
01 Mar 2006: added documentation; added more predefined English stop words; added map and plural parameters; reworked order of tokenization; added fadeRGB format string variable
24 Feb 2006: Initial version