[NANOWEB DOCUMENTATION]

NANOWEB, the aEGiS PHP web server

mod_pfilters

The pfilters module allows you to pass the contents of delivered pages (static files as well as CGI output) through any of the internal filters or through external filter programs available on your system.

This can be used, for example, to make all your HTML pages comply with the standard at the moment they are requested. One such external filter program is »tidy«, the HTML validator from w3.org, which can correct your pages on the fly through the nanoweb filter feature.

All filters become available when the appropriate module is loaded (the core filters are in mod_pfilters, the "gzip" filter is part of mod_gzip, and some extension filters are located in mod_html_pfilters). However, except for the mod_gzip filter, these filters are not active until you put a Filter directive referencing them into the configuration files:
 FilterEnable = 1
 Filter =        */*   null  -- you don't want to use this one!
 Filter =        */*   unchunk
 Filter =  text/html   pipe  /usr/bin/tidy -q -latin1
 Filter = text/xhtml   pipe  /usr/bin/tidy -xml -q
 Filter =  text/html   wap
 Filter =    image/*   wbmp 100x60
#Filter =        */*   gzip  -- this is added automagically by mod_gzip
Note: every filter is bound to a MIME type and is only executed if the content being delivered matches it. This way you can bind »tidy« to HTML files and, for example, »watermark« to image files.
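
The MIME type matching described in the note can be sketched in a few lines of Python. This is only an illustration, not nanoweb's own code: the RULES table, the function name filters_for and the use of shell-style wildcards are assumptions made for this example.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table mirroring some of the Filter lines above:
# (MIME pattern, filter name, argument string).
RULES = [
    ("*/*", "unchunk", ""),
    ("text/html", "pipe", "/usr/bin/tidy -q -latin1"),
    ("image/*", "wbmp", "100x60"),
]

def filters_for(content_type, rules=RULES):
    """Return the (name, args) pairs whose pattern matches content_type."""
    # Drop parameters such as "; charset=iso-8859-1" before matching.
    mime = content_type.split(";", 1)[0].strip().lower()
    return [(name, args) for pattern, name, args in rules
            if fnmatchcase(mime, pattern)]
```

With these rules, a text/html response would pick up unchunk and pipe, while an image/png response would pick up unchunk and wbmp.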

unchunk (core): Content from CGIs is often chunked, which means it is split into many parts and thus cannot be passed to filters (like 'pipe') that need to work on the whole file. You should therefore always enable the unchunk filter, which tries to reassemble all chunks up to a given size (128K without argument). For example, Filter = */* unchunk 300 would try to recombine chunks up to 300 kbytes.
pipe (core): The pipe filter is the most powerful of all the filters, as it passes the current content through an external filter program, which is given as the argument to this filter (you need to specify the full path name here, as it is checked for existence initially).
Filter programs are very common utilities in a UNIX-like environment, but you probably want to use HTML-aware or XML-capable filters only.
BTW, the »pipe« filter is the one the pfilters infrastructure was implemented for.
null (core): Does absolutely nothing
(you could use /bin/cat to do equally nothing).
static (core): The server core now converts static content into an internal parser object itself, so you really do not need to care about this one!
gzip (mod_gzip, added automagically): If you load mod_gzip into the server, delivered content is compressed on the fly. This filter is activated automagically for all files, but you can still assign it in a Filter = directive like all the others (in that case, take care that it is the last one).
shrink (html_filters): This filter tries to remove all newlines from your HTML page, so that it ends up as a single line. Besides the smaller size (which also helps gzip), your page becomes unreadable without appropriate tools, which makes this a small weapon against code sniffing.
Warning: this conversion does not harm your CSS areas (per the W3C specification), but embedded scripts (JavaScript and the like) become unusable if they contain comments; additionally, this filter refuses to work if the file contains <PRE> tags.
downcase (html_filters): Converts all tags and their attributes to lowercase (adding missing quotation marks as well), which additionally helps when compressing the file. This filter is rather slow.
wap (html_filters, experimental): Tries to convert your HTML to WML code; this rule is of course only applied if the client actually requests WML.
The HTML should be valid, otherwise the WML will not be valid either, so it is highly recommended to use a more sophisticated external conversion utility instead of this very slow internal one.
# best way to use the »wap« filter:
Filter = text/html pipe /usr/bin/tidy -q
Filter = text/html wap
Filter = application/vnd.wap.wml pipe /usr/bin/tidy -xml -q
garbage (html_filters, senseless): This filter corrupts all your HTML files.
convert (img_filters): There is a demo filter package for images. Its convert filter lets you change the image type from PNG to JPEG and vice versa by specifying the target format after the filter name. Some other file formats may be supported (for input at least), but note that GIF is rarely supported and thus cannot be used with the img_filters module.
copyright (img_filters): You can use this filter to add a text snippet to served images; just add some text after the filter name, e.g.:
Filter = image/jpeg copyright (c) 2002 whoever
wbmp (img_filters): The WBMP format is intended for the browsers of cellular phones; you probably want to enable this filter for images wherever you have enabled the »wap« filter for hypertext files.
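
The interplay of the unchunk and pipe filters described above can be sketched in Python. This is a rough illustration under stated assumptions, not the module's implementation: the function names, the choice to return None when the size limit is exceeded, and the stand-in program UPPERCASE are all hypothetical.

```python
import subprocess
import sys

def unchunk(chunks, limit_kb=128):
    """Join chunks into one body, or return None if they exceed the limit."""
    limit = limit_kb * 1024
    body = bytearray()
    for chunk in chunks:
        body += chunk
        if len(body) > limit:
            # Assumption: oversized content is left alone and not filtered.
            return None
    return bytes(body)

def pipe(body, argv):
    """Feed body to an external program on stdin and return its stdout."""
    # The exit status is deliberately ignored here; tidy, for example,
    # reports mere warnings through a non-zero exit status.
    result = subprocess.run(argv, input=body, stdout=subprocess.PIPE)
    return result.stdout

# Stand-in filter program so the sketch runs without tidy installed:
# it simply uppercases whatever it reads.
UPPERCASE = [
    sys.executable, "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
]
```

A call such as pipe(unchunk(chunks), UPPERCASE) shows why unchunk has to come first: the external program receives the whole body on its standard input in one go. To try a real filter, the argument list could be replaced with ["/usr/bin/tidy", "-q", "-latin1"], as in the configuration example.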


Note that you can disable all filters at once with the FilterEnable directive on a per-directory basis.

The name »pfilters« comes from the fact that it wraps a filter around the nanoweb internal »parser« objects which represent the requested files.

The module mod_gzip now uses the pfilters infrastructure as well. You do not need to worry about whether your Filter rules are mentioned before mod_gzip, because the server ensures internally that compression is applied last (most filters do not work on already compressed content).
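
Why compression has to come last can be shown with a small Python sketch. It is not taken from the nanoweb sources: order_filters, apply_chain and the strip_newlines stand-in (a much simplified relative of »shrink«) are names invented for this example.

```python
import gzip

def strip_newlines(body):
    """Stand-in text filter: remove all newlines from the body."""
    return body.replace(b"\n", b"")

# Hypothetical filter table for this sketch.
FILTERS = {"shrink": strip_newlines, "gzip": gzip.compress}

def order_filters(names):
    """Return names with "gzip" moved to the end, other order unchanged."""
    others = [name for name in names if name != "gzip"]
    if "gzip" in names:
        others.append("gzip")
    return others

def apply_chain(body, names):
    """Run the body through the named filters, compressing last."""
    for name in order_filters(names):
        body = FILTERS[name](body)
    return body
```

Even if »gzip« is listed first, the text filter still sees uncompressed HTML, and decompressing the result yields the filtered page.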



