Fsdb::Filter - base class for Fsdb filters
Fsdb::Filter is the virtual base class for Fsdb filters.
Users will typically invoke individual programs via the command line (for example, see dbcol(1)) or string together several in a Perl program as described in dbpipeline(3).
For new Filter developers, internal processing is:
new
set_defaults
parse_options
autorun if desired
parse_options # optionally called additional times
setup # does IO on header
run # does IO on data
finish # any shutdown
In addition, the "info" method returns metadata about a given filter.
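The call order above can be illustrated with a minimal sketch. ToyFilter is a hypothetical stand-in class, not part of Fsdb; it only records which lifecycle methods were invoked, and its handling of "--autorun" is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; NOT Fsdb code. It only records
# the order in which the lifecycle methods are invoked.
package ToyFilter;

sub new {
    my ($class, @argv) = @_;
    my $self = bless { _calls => [] }, $class;
    $self->set_defaults;
    $self->parse_options(@argv);
    # Autorun, if requested, happens at the end of construction.
    $self->setup_run_finish if $self->{_autorun};
    return $self;
}

sub _note {
    my ($self, $what) = @_;
    push @{ $self->{_calls} }, $what;
    return;
}

sub set_defaults {
    my $self = shift;
    $self->{_autorun} = 0;
    $self->_note('set_defaults');
    return;
}

sub parse_options {
    my ($self, @argv) = @_;
    $self->_note('parse_options');
    for my $arg (@argv) {
        $self->{_autorun} = 1 if $arg eq '--autorun';
        $self->{_autorun} = 0 if $arg eq '--noautorun';
    }
    return;
}

sub setup  { my $self = shift; $self->_note('setup');  return; }   # header IO would go here
sub run    { my $self = shift; $self->_note('run');    return; }   # data IO would go here
sub finish { my $self = shift; $self->_note('finish'); return; }   # shutdown would go here

sub setup_run_finish {
    my $self = shift;
    $self->setup;
    $self->run;
    $self->finish;
    return;
}

package main;

# Without autorun, the caller drives the remaining phases explicitly.
my $manual = ToyFilter->new('--noautorun');
$manual->setup_run_finish;
my $order = join(' ', @{ $manual->{_calls} });
print "$order\n";   # prints: set_defaults parse_options setup run finish
```

In this sketch, passing "--autorun" produces the same sequence without the explicit setup_run_finish call.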
new
$fsdb = new Fsdb::Filter;
Create a new filter object, calling set_defaults and parse_options. A user program will call a specific filter (say Fsdb::Filter::dbcol) to do processing. See also dbpipeline for aliases that remove the wordiness.
post_new
$filter->post_new();
Called when the subclass is done with new, giving Fsdb::Filter a chance to autorun.
set_defaults
$filter->set_defaults();
Set up object defaults. Called once during new.
Fsdb::Filter::set_defaults does some general setup, tracking module invocation and preparing for one input and output stream.
set_default_tmpdir
$filter->set_default_tmpdir
Figure out a tmpdir, from environment variables if necessary.
parse_options
$filter->parse_options(@ARGV);
parse_options is called one or more times to parse ARGV-style options. It should not do any IO or take any irreversible actions; defer those to setup.
Fsdb::Filter::parse_options does no work; the subclass is expected to call Fsdb::Filter::get_options() for all arguments.
Most modules implement the standard fsdb options listed below:
-d
Enable debugging output.
-i or --input InputSource
Read from InputSource, typically a file name, or "-" for standard input, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
-o or --output OutputDestination
Write to OutputDestination, typically a file name, or "-" for standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
--autorun or --noautorun
By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The "--(no)autorun" option controls that behavior within Perl.
--noclose
By default, programs close their output when done. For some cases where objects are used internally, "--noclose" may be used to leave output open for further I/O. (This option is only supported by some filters.)
--saveoutput $OUT_REF
By default, programs close their output when done. With this option, programs in Perl can have a subprogram create an output reference and return it to the caller in $OUT_REF. The caller can then use it for further I/O. (This option is only supported by some filters.)
--help
Show help.
--man
Show full manual.
parse_target_column
$self->parse_target_column(\@argv);
A helper function: allow one column to be specified as the "_target_column".
get_options
$success = $filter->get_options(\@argv, "v+" => \$verbose, ...)
get_options is just like Getopt::Long’s GetOptions, but takes the argument list as the first argument. This list is modified and any non-options are returned. It also saves _orig_argv in itself.
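As a rough illustration of this calling convention, the sketch below uses GetOptionsFromArray from the core Getopt::Long module, which likewise takes the argument list first and leaves non-options behind in it. This is not get_options itself: the option names and sample arguments are assumptions, and any Getopt::Long configuration that Fsdb applies internally is not reproduced here.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Sample ARGV-style list (illustrative values only).
my @argv = ('-v', '-v', '--input', 'data.fsdb', 'colA', 'colB');

my $verbose = 0;
my $input;

# The array is passed by reference and modified in place:
# recognized options are consumed, non-options remain.
my $success = GetOptionsFromArray(
    \@argv,
    'v+'        => \$verbose,   # counter: each -v increments
    'i|input=s' => \$input,     # string-valued option
);

print "verbose=$verbose input=$input rest=@argv\n";
# prints: verbose=2 input=data.fsdb rest=colA colB
```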
parse_sort_option
$fsdb_io = $filter->parse_sort_option($option_name, $target);
This helper function handles sorting options and column names as described in dbsort(1). We normalize long sort options to unbundled short options and accumulate them in $self->{_sort_argv}.
parse_io_option
$fsdb_io = $filter->parse_io_option($io_direction, $option_name, $target);
This helper function handles "--input" or "--output" options, without doing any setup.
It fills in $self->{_$IO_DIRECTION} with the resulting token, which is either a file handle or a Fsdb::Filter::Pipeline object, and expects "finish_io_option" to convert this token into a full Fsdb::IO object.
$IO_DIRECTION is usually input or output, but it can also be inputs (with an "s") when multiple input sources are allowed.
finish_one_io_option
$fsdb_io = $filter->finish_one_io_option($io_direction, $token, @fsdb_args);
This helper function finishes setting up a Fsdb::IO object in $IO_DIRECTION, using $TOKEN as information and @FSDB_ARGS as parameters. It creates the actual Fsdb::IO objects, opens the files (or whatever), and reads the headers. It returns the $FSDB_IO object.
$IO_DIRECTION must be "input" or "output".
Since it does IO, finish_one_io_option should only be called from setup, not parse_options.
Can be called once per IO stream.
finish_io_option
$filter->finish_io_option($io_direction, @fsdb_args);
This helper function finishes setting up a Fsdb::IO object in $IO_DIRECTION, using @FSDB_ARGS as parameters. It creates the actual Fsdb::IO objects, opens the files (or whatever), and reads the headers. The resulting Fsdb::IO objects are built from "$self->{_$IO_DIRECTION}" and are left in "$self->{_in}" (or "_out" or "@_ins").
$IO_DIRECTION must be "input", "inputs" or "output".
Since it does IO, finish_io_option should only be called from setup, not parse_options.
Can be called once per IO stream.
No return value.
direction_to_stdio
$fh = direction_to_stdio($direction)
Private internal routine. Returns a filehandle for STDIN or STDOUT, depending on whether $DIRECTION is 'input' or 'output'.
finish_fh_io_option
$filter->finish_fh_io_option($io_direction);
This helper function creates a filehandle in $IO_DIRECTION. Compare to finish_io_option, which creates a Fsdb::IO object. It creates the actual IO::File objects and opens the files (or whatever). The filehandle is built from "$self->{_$IO_DIRECTION}" and is left in "$self->{_in}" (or "_out").
$IO_DIRECTION must be "input" or "output".
This function does no IO.
No return value.
setup
$filter->setup();
Do any setup that requires minimal IO (for example, reading and parsing headers).
Called exactly once.
run
$filter->run();
Execute the body, typically iterating over the input rows.
Called exactly once.
compute_program_log
$log = $filter->compute_program_log();
Compute and return the log entry for a program.
finish
$filter->finish();
Write out any trailing comments and close output.
setup_run_finish
$filter->setup_run_finish();
Shorthand for doing everything needed to run a command straightaway.
info
$filter->info($INFOTYPE)
Return information about what the filter does. Infotypes:
input_type    Types of input accepted. Raw types are: "fsdbtext", "fsdbobj", "fsdb*", "text", or "none".
output_type   Type of output produced. Same format as input_type.
input_count   Number of input streams (usually 1).
output_count  Number of output streams (usually 1).
Filter has some class-specific utility routines in it. (I.e., they know about $self.)
create_pass_comments_sub
$filter->create_pass_comments_sub
or
$filter->create_pass_comments_sub('_VALUE');
Creates a code block suitable for passing to "Fsdb::IO::Readers" "-comment_handler" that passes comments through to "$self->{_out}". Or, with the optional argument, through "$self->{_VALUE}".
create_tolerant_pass_comments_sub
$filter->create_tolerant_pass_comments_sub
or
$filter->create_tolerant_pass_comments_sub('_VALUE');
Like "$self->create_pass_comments_sub", but this version tolerates the output not being opened. In those cases, comments are discarded. Warning: use carefully to guarantee consistent results.
A symptom requiring tolerance is to get an error like "Can’t call method "write_raw" on an undefined value at /usr/lib/perl5/vendor_perl/5.10.0/Fsdb/Filter.pm line 678." (which will be the sub create_pass_comments_sub ($;$) line in create_pass_comments.)
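The difference between the strict and tolerant handlers can be sketched with closures. Everything here is a hypothetical stand-in rather than the Fsdb implementation: ToyOut imitates only an object with a write_raw method, and the make_* helper names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an output object that has write_raw.
package ToyOut;
sub new {
    my $class = shift;
    return bless { lines => [] }, $class;
}
sub write_raw {
    my ($self, $text) = @_;
    push @{ $self->{lines} }, $text;
    return;
}

package main;

# Strict handler: fails if the output slot is still undefined.
sub make_pass_comments_sub {
    my ($self, $key) = @_;
    $key //= '_out';
    return sub { $self->{$key}->write_raw(@_); };
}

# Tolerant handler: discards comments until an output exists.
sub make_tolerant_pass_comments_sub {
    my ($self, $key) = @_;
    $key //= '_out';
    return sub {
        return unless defined $self->{$key};
        $self->{$key}->write_raw(@_);
    };
}

my $filter   = { _out => undef };
my $strict   = make_pass_comments_sub($filter);
my $tolerant = make_tolerant_pass_comments_sub($filter);

# Output not opened yet: the strict handler dies, the tolerant one drops.
my $ok = eval { $strict->("# early comment\n"); 1 };
my $strict_error = $ok ? '' : $@;
$tolerant->("# early comment\n");

# Once output exists, the tolerant handler passes comments through.
$filter->{_out} = ToyOut->new;
$tolerant->("# late comment\n");
my @passed = @{ $filter->{_out}{lines} };
print "passed: @passed";
```

The early comment is silently lost in the tolerant case, which is why the warning about consistent results applies.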
create_delay_comments_sub
$filter->create_delay_comments_sub($optional_value);
Creates a code block suitable for passing to Fsdb::IO::Readers -comment_handler that will buffer comments for automatic output (from $self->finish) after all other IO. No output occurs until finish() is called, at which time "$self->{_out}" must be a live Fsdb object.
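The buffering idea can be sketched as follows. ToyOut, the "_delayed" slot, and both helper names are assumptions made for illustration and do not reflect how Fsdb stores delayed comments internally.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an output object that has write_raw.
package ToyOut;
sub new {
    my $class = shift;
    return bless { lines => [] }, $class;
}
sub write_raw {
    my ($self, $text) = @_;
    push @{ $self->{lines} }, $text;
    return;
}

package main;

# Returns a handler that only buffers; nothing is written yet.
sub make_delay_comments_sub {
    my ($self) = @_;
    return sub { push @{ $self->{_delayed} }, @_; return; };
}

# Called at finish time: the output must exist by now.
sub flush_delayed_comments {
    my ($self) = @_;
    $self->{_out}->write_raw($_) for @{ $self->{_delayed} };
    $self->{_delayed} = [];
    return;
}

my $filter = { _out => undef, _delayed => [] };
my $delay  = make_delay_comments_sub($filter);

$delay->("# comment seen during run\n");   # buffered, output not needed yet
$filter->{_out} = ToyOut->new;
$filter->{_out}->write_raw("row1\n");      # data goes out first
flush_delayed_comments($filter);           # comments follow all other IO

my @written = @{ $filter->{_out}{lines} };
print @written;
```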
create_compare_code
$filter->create_compare_code($a_fsdb, $b_fsdb, $a_fref_name, $b_fref_name);
Write compare code based on sort-style options stored in "$self->{_sort_argv}". $A_FSDB and $B_FSDB are the Fsdb::IO objects that define the schemas for the two objects. We assume the variables $a and $b point to arefs; these names can be overridden by specifying $A_FREF_NAME and $B_FREF_NAME.
Returns undef if there are no fields in "$self->{_sort_argv}".
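Generating compare code as a string and compiling it can be sketched like this. The spec format (column index, 'n' or 'l' for numeric or lexical, descending flag) and the build_compare_code helper are hypothetical; the real method derives this information from "_sort_argv" and the Fsdb::IO schemas.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a list of [index, type, descending]
# specs into Perl source comparing two array references.
sub build_compare_code {
    my ($specs, $a_name, $b_name) = @_;
    $a_name //= 'a';
    $b_name //= 'b';
    return unless @$specs;   # undef in scalar context when no fields
    my @terms;
    for my $spec (@$specs) {
        my ($i, $type, $desc) = @$spec;
        my $op = $type eq 'n' ? '<=>' : 'cmp';
        # Descending order swaps the two operands.
        my ($l, $r) = $desc ? ($b_name, $a_name) : ($a_name, $b_name);
        push @terms, sprintf('($%s->[%d] %s $%s->[%d])', $l, $i, $op, $r, $i);
    }
    return join(' || ', @terms);
}

# Column 1 numeric descending, then column 0 lexical ascending.
my $code = build_compare_code([[1, 'n', 1], [0, 'l', 0]]);
my $cmp  = eval "sub { $code }" or die $@;

my @rows   = (['x', 3], ['y', 10], ['a', 10]);
my @sorted = sort $cmp @rows;
my $names  = join(' ', map { $_->[0] } @sorted);
print "$names\n";   # prints: a y x

my $none = build_compare_code([]);   # no sort fields: undef
```

Compiling the comparison once avoids re-interpreting the sort options on every pair of rows.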
numeric_formatting
$out = $self->numeric_formatting($x)
Display a floating point number $x using $self->{_format}, handling possible non-numeric "-" as a special case.
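A minimal sketch of this behavior, written as a plain function instead of a method. The format_number name and the "%.5g" format are assumptions for illustration; the actual format comes from $self->{_format}.

```perl
use strict;
use warnings;

# Hypothetical helper: format $x with a printf-style format,
# passing the non-numeric placeholder "-" through unchanged.
sub format_number {
    my ($format, $x) = @_;
    return $x if $x eq '-';
    return sprintf($format, $x);
}

my $pi      = format_number('%.5g', 3.14159265);   # "3.1416"
my $missing = format_number('%.5g', '-');          # "-"
print "$pi $missing\n";   # prints: 3.1416 -
```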
setup_exactly_two_inputs
$self->setup_exactly_two_inputs
Ensure that there are exactly two input streams. Common to dbmerge and dbjoin.
Filter also has some utility routines that are not part of the class structure. They are not exported.
(none currently)