[NANOWEB DOCUMENTATION]

NANOWEB, the aEGiS PHP web server

mod_pfilters

The pfilters module allows you to pass the contents of delivered pages (static files as well as CGI output) through any of the internal filters or through any external filter program available on your system.

All filters become available when the appropriate module is loaded (the core filters are part of mod_pfilters, the "gzip" filter is part of mod_gzip, and some extension filters are located in mod_html_pfilters). However, except for the gzip filter, these filters are not active until you put a Filter directive referencing them into the configuration files:
 FilterEnable = 1
 Filter =        */*   null  -- you don't want to use this one!
 Filter =        */*   unchunk
 Filter =  text/html   pipe  /usr/bin/tidy -q -latin1
 Filter = text/xhtml   pipe  /usr/bin/tidy -xml -q
 Filter =  text/html   wap
 Filter =    image/*   wbmp 100x60
#Filter =        */*   gzip  -- this is added automagically by mod_gzip
Note: every filter is assigned to a MIME type and is only executed if the current content matches it. This way you could bind »tidy« to HTML files and, for example, »watermark« to image files.

Besides MIME types, you can also use file extensions in a filter rule, and both can be combined in one rule by separating them with »|«:
 Filter = .html        null
 Filter = .exe         pipe /usr/local/f-prot/f-prot -ai -
 Filter = .txt|.pdf    null
 Filter = pdf|image/*  null
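
The matching of such rules can be pictured with a small Python sketch. This is not nanoweb's actual code: the function name rule_matches is hypothetical, and the exact semantics (alternatives containing a slash are treated as MIME globs, everything else as a file extension with or without the leading dot) are assumptions drawn from the examples above.

```python
from fnmatch import fnmatchcase


def rule_matches(pattern: str, mime_type: str, filename: str) -> bool:
    """Return True if any '|'-separated alternative matches the content."""
    for alt in pattern.split("|"):
        alt = alt.strip()
        if "/" in alt:
            # Assumption: an alternative containing a slash is a MIME glob.
            if fnmatchcase(mime_type.lower(), alt.lower()):
                return True
        elif alt:
            # Assumption: anything else is a file extension, with or
            # without the leading dot.
            if filename.lower().endswith("." + alt.lstrip(".").lower()):
                return True
    return False


print(rule_matches("*/*", "text/html", "index.html"))            # True
print(rule_matches(".txt|.pdf", "application/pdf", "doc.pdf"))   # True
print(rule_matches("pdf|image/*", "image/png", "logo.png"))      # True
print(rule_matches("text/html", "image/png", "logo.png"))        # False
```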

These are the currently available filters:

unchunk core Content from CGIs is often chunked, which means it is split into many parts and thus cannot be passed to filters (like 'pipe') that need to work on the whole file. You should therefore always enable the unchunk filter, which tries to reassemble all chunks up to a given size (128K without argument). For example, Filter = */* unchunk 300 would try to recombine chunks up to 300 kbytes.
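
The idea behind unchunk can be sketched in a few lines of Python. This is only an illustration, not the module's implementation: the function name unchunk and the behaviour on oversized content (returning None, i.e. leaving the content chunked) are assumptions; only the 128K default and the size argument in kbytes come from the description above.

```python
from typing import Iterable, Optional

DEFAULT_LIMIT_KB = 128  # default size mentioned in the description


def unchunk(chunks: Iterable[bytes],
            limit_kb: int = DEFAULT_LIMIT_KB) -> Optional[bytes]:
    """Join chunks into one body, or return None if the limit is exceeded."""
    limit = limit_kb * 1024
    buffered = []
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > limit:
            # Assumption: oversized content is left alone (signalled by None).
            return None
        buffered.append(chunk)
    return b"".join(buffered)


print(unchunk([b"<html>", b"<body>hi</body>", b"</html>"]))
print(unchunk([b"x" * 200_000]))             # None: larger than 128 kbytes
print(len(unchunk([b"x" * 200_000], 300)))   # 200000: fits into 300 kbytes
```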
pipe core The pipe filter is the most powerful of all the filters, as it passes the current content through an external filter program, which is given as the argument to this filter (you need to specify the full path name here, because the program's existence is checked beforehand).
Filter programs are very common utilities in a UNIX-like environment, but you probably want to use HTML-aware or XML-capable filters only.
BTW, the »pipe« filter is the one the pfilters infrastructure was implemented for.
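
What the pipe filter does can be approximated in Python as follows. This is a sketch under assumptions, not the server's code: the function name pipe_filter is hypothetical, and raising an error for a missing program is just one way to model the existence check. The demo uses the running Python interpreter as the external "filter program" so that it works without tidy being installed.

```python
import os
import subprocess
import sys
from typing import Sequence


def pipe_filter(content: bytes, command: Sequence[str]) -> bytes:
    """Feed content to an external program and return what it prints."""
    program = command[0]
    # Mirrors the requirement above: a full path that must exist.
    if not (os.path.isabs(program) and os.path.isfile(program)):
        raise FileNotFoundError(f"filter program not found: {program}")
    result = subprocess.run(list(command), input=content,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


# Stand-in filter program: upper-cases everything it reads from stdin.
upper = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"]
print(pipe_filter(b"<p>hello</p>", upper))
```

With tidy installed, the command from the configuration example would be passed as ["/usr/bin/tidy", "-q", "-latin1"]; note that tidy may exit with a non-zero status on warnings, so check=True would need to be relaxed in that case.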
null core Does absolutely nothing
(you could pipe through /bin/cat to achieve the same nothing)
static core The server core now converts static content into an internal parser object itself, so you really don't need to care about this one!
 
gzip
(automagically)
mod_gzip Delivered content is compressed on the fly if you load mod_gzip into the server. This filter is activated automagically for all files, but you can still assign it in a Filter= directive like all the others (in that case, take care that it is the last one).
 
shrink html_filters This filter tries to remove all newlines from your HTML page, so that it ends up as a single line. Besides the smaller size (which also helps gzip), your page becomes unreadable without appropriate tools, which makes it a small weapon against code sniffing.
Warning: this conversion does not harm your CSS areas (per the W3C specification), but embedded script code (JavaScript and the like) becomes unusable if it contains comments; additionally, this filter refuses to work if the file contains <PRE> tags.
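
A rough Python sketch of this behaviour, including the reason for the warning about script comments. It is not the filter's actual code: the function name shrink is hypothetical, and treating "refuses to work" as "passes the page through unchanged" is an assumption.

```python
import re

_PRE_TAG = re.compile(r"<pre\b", re.IGNORECASE)


def shrink(html: str) -> str:
    """Remove all newlines, unless the page contains a <PRE> tag."""
    if _PRE_TAG.search(html):
        # Assumption: refusing to work means returning the page unchanged.
        return html
    return html.replace("\r", "").replace("\n", "")


print(shrink("<html>\n<body>\n<p>Hello</p>\n</body>\n</html>\n"))
# A line comment now swallows the code that used to follow it:
print(shrink("<script>\n// init\nrun();\n</script>"))
# Pages with <PRE> are left alone:
print(shrink("<PRE>\nkeep\n</PRE>") == "<PRE>\nkeep\n</PRE>")
```

The second call prints <script>// initrun();</script>, which shows why script code with comments stops working once the newlines are gone.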
downcase html_filters Converts all tags and their attributes to lowercase (adding missing quotation marks along the way), which additionally helps when compressing the file. This filter is rather slow.
wap
(experimental)
html_filters Tries to convert your HTML to WML code; this rule is of course only applied if the client actually requests WML.
The HTML should be valid, otherwise the WML won't be valid either; it is therefore highly recommended to use a more sophisticated external conversion utility instead of this very slow internal one.
# best way to use the »wap« filter:
Filter = text/html pipe /usr/bin/tidy -q
Filter = text/html wap
Filter = application/vnd.wap.wml pipe /usr/bin/tidy -xml -q
garbage
(senseless)
html_filters This filter corrupts all your HTML files.
 
convert img_filters There is a demo filter package for images. Its convert filter enables you to change the image type from PNG to JPEG and vice versa by specifying the target format after the filter name. Some other file formats may be supported (for input at least), but note that GIF files are rarely supported and thus cannot be used with the img_filters module.
copyright img_filters You can use this filter to add a text snippet to served images; just add some text after the filter name, e.g.:
Filter = image/jpeg copyright (c) 2002 whoever 
wbmp img_filters The wbmp format is intended for cell phone browsers; you probably want to enable this filter for images wherever you enabled the »wap« filter for hypertext files.
 
error misc_filters This filter enables you to prevent certain files from being delivered. It doesn't output a real HTTP error response, but you may give a response code as argument to this filter.
addservervar misc_filters You can set up server/environment variables with this filter for selected files or MIME types (for CGI scripts, for example).
addheader misc_filters Allows you to output an additional arbitrary HTTP response header together with selected files, or files of a specified MIME type.
nocache misc_filters Applied to any file, this filter prevents it from being cached by proxies; it does so by simply adding the corresponding HTTP response headers.
handler misc_filters This filter corresponds to the Apache »AddHandler« directive and allows you to have specified files processed by the CGI script given as argument to this filter.
Filter = .myhtm|.txt handler /cgi-bin/needs-frame.php
In this example, files with the extension .myhtm or .txt would invoke the given PHP script with PATH_INFO set to the filename of the originally requested .myhtm or .txt file. The script (handler) must then take care to produce some output from that file (for example, build an HTML table around the loaded plain text).
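
To illustrate what such a handler has to do, here is a minimal sketch of the logic in Python rather than PHP. It is only an illustration: the function name handle is hypothetical, and it assumes that PATH_INFO contains a path the script can open directly, which may differ from what the server actually passes.

```python
import html
import os
import tempfile


def handle(environ: dict) -> str:
    """Build a CGI response that wraps the requested text file in HTML."""
    # Assumption: PATH_INFO is a readable path to the requested file.
    path = environ["PATH_INFO"]
    with open(path, encoding="utf-8", errors="replace") as fh:
        text = fh.read()
    title = html.escape(os.path.basename(path))
    body = html.escape(text)  # escape <, > and & from the plain text
    return (
        "Content-Type: text/html\r\n\r\n"
        f"<html><head><title>{title}</title></head>"
        f"<body><pre>{body}</pre></body></html>"
    )


# Demo with a temporary file standing in for the requested .txt file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 < 2\n")
print(handle({"PATH_INFO": tmp.name}))
os.unlink(tmp.name)
```

A real CGI handler would call handle(os.environ) and write the result to standard output.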


Note that you can disable all filters at once with the FilterEnable directive on a per-directory basis.

The name »pfilters« comes from the fact that it wraps a filter around the nanoweb internal »parser« objects which represent the requested files.

The module mod_gzip now uses the pfilters infrastructure as well, so you do not need to make sure that your Filter rules appear before mod_gzip; this is ensured internally (most filters don't work on already compressed content).



