Fsdb::IO::Reader

NAME

Fsdb::IO::Reader - handle formatted reading from an fsdb file (or file handle) or queue

SAMPLE CODE

Sample code reading an input stream:

my $in = new Fsdb::IO::Reader(-file => '-');
$in->error and die "cannot open stdin as fsdb: " . $in->error . "\n";
my @arow;
while ($in->read_row_to_aref(\@arow)) {
    # do something with @arow
}
$in->close;

METHODS

new
$fsdb = new Fsdb::IO::Reader(-file => $filename);
$fsdb = new Fsdb::IO::Reader(-header => "#fsdb -F t foo bar", -fh => $file_handle);

Creates a new reader object from FILENAME. (FILENAME can also be an IO::Handle object, or an hdfs: file.) The constructor always returns an object, so call the "error" method afterwards to test for failure.

Options:
other options
See also the options in Fsdb::IO, including "-file" and "-header".
-file FILENAME
Open and read the given filename. The special filename "-" is standard input, and filenames beginning with hdfs: are read from Hadoop (but not with directory aggregation).
-comment_handler $ref

Define how comments are handled. If $ref is a Fsdb::IO::Writer object, comments are written to that stream as they are encountered. If $ref is a reference to a scalar, that scalar is assumed to be filled in with a Fsdb::IO::Writer object later and is treated the same way. If $ref is a code reference, it is assumed to be a callback function of the form:

sub comment_handler ($) { my ($comment) = @_; }

where the one argument will be a string with the unparsed comment (with leading # and trailing newline).

By default, or if $ref is undef, comments are consumed.

A typical handler if you have an output Fsdb stream is:

sub { $out->write_raw(@_); };

(That is the code created by Fsdb::Filter::create_pass_comments_sub.)

There are several support routines to handle comments in a pipeline; see Fsdb::Filter::create_pass_comments_sub, Fsdb::Filter::create_tolerant_pass_comments_sub, Fsdb::Filter::create_delay_comments_sub.
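As a minimal sketch of the code-reference form of -comment_handler, the following builds a handler that collects comments in an array instead of passing them downstream. The helper make_comment_collector and its variables are hypothetical and not part of Fsdb; the only thing assumed about Fsdb is the callback contract described above (one string argument per comment).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Fsdb): returns a code ref suitable for
# -comment_handler, plus a reference to the array it appends comments to.
sub make_comment_collector {
    my @comments;
    my $handler = sub {
        my ($comment) = @_;    # unparsed comment: leading "#", trailing newline
        push @comments, $comment;
        return;
    };
    return ($handler, \@comments);
}

my ($handler, $comments) = make_comment_collector();

# Assumed usage, following the constructor synopsis:
#   my $in = new Fsdb::IO::Reader(-file => '-', -comment_handler => $handler);
# Once the rows have been read, @$comments holds the comments in input order.
```

Collecting comments this way is only one option; when an output stream exists, the Fsdb::Filter support routines listed above are probably the better fit.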

User-specified -header arguments override a header provided in the input source.

config_one
documented in new

comment_handler_to_sub
internal use only: parses and sets up the comment handler callback. (At input, _comment_sub is as given by -comment_handler, but at exit it is always an anonymous function.)

_enable_compression
$self->_enable_compression

internal use only: switch from uncompressed to compressed.

create_io_subs
$self->create_io_subs()

internal use only: create a thunk that returns rowobjs.

read_headerrow
internal use only; reads the header

read_rowobj
$rowobj = $fsdb->read_rowobj;

Reads a line of input and returns a "row object": a scalar string for a comment or header, an array reference for a row, or undef on end-of-stream. This routine is the fastest way to do full-featured fsdb-formatted IO. (Although see also Fsdb::IO::Reader::fastpath_sub.)

Unlike all the other routines (including fastpath_sub), read_rowobj does not do comment processing (calling comment_sub).
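Because read_rowobj hands back comments and headers itself, the caller has to tell them apart from data rows. The following is a sketch of that dispatch, based only on the return types described above; count_rowobjs is a hypothetical helper, and $in is assumed to be an Fsdb::IO::Reader that has already been opened.

```perl
use strict;
use warnings;

# Hypothetical helper: count data rows and text lines (comments or headers).
# $in is assumed to be an Fsdb::IO::Reader, or any object with a
# compatible read_rowobj method.
sub count_rowobjs {
    my ($in) = @_;
    my ($rows, $text) = (0, 0);
    # Test definedness rather than truth, so only end-of-stream ends the loop.
    while (defined(my $rowobj = $in->read_rowobj)) {
        if (ref($rowobj) eq 'ARRAY') {
            $rows++;    # array reference: a data row
        } else {
            $text++;    # plain scalar: a comment or header line
        }
    }
    return ($rows, $text);
}
```

It would be called as count_rowobjs($in) after constructing $in as in the sample code.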

read_row_to_aref
$fsdb->read_row_to_aref(\@a);

Reads the next row into array @a, so that $a[0] is the 0th column, etc. Returns undef if the read fails, typically due to EOF.

unread_rowobj
$fsdb->unread_rowobj($fref)

Put an fref back into the stream.

unread_row_from_aref
$fsdb->unread_row_from_aref(\@a);

Put array @a back into the file.

read_row_to_href
$fsdb->read_row_to_href(\%h);

Read the next row into hash %h. Then $h{'colname'} is the value of that column. Returns undef if the read fails, typically due to EOF.
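A short sketch of reading by column name follows. The helper sum_column and the column name passed to it are hypothetical, and the loop assumes that read_row_to_href returns a true value on success, mirroring the read_row_to_aref loop in the sample code.

```perl
use strict;
use warnings;

# Hypothetical helper: sum one named column over all remaining rows.
# $in is assumed to be an Fsdb::IO::Reader (or a compatible object);
# the column is assumed to hold numeric values.
sub sum_column {
    my ($in, $colname) = @_;
    my $sum = 0;
    my %h;
    while ($in->read_row_to_href(\%h)) {
        $sum += $h{$colname};    # look the value up by column name
    }
    return $sum;
}
```

For a file with the header "#fsdb -F t foo bar", sum_column($in, 'bar') would add up the second column.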

unread_row_from_href
$fsdb->unread_row_from_href(\%h);

Put hash %h back into the file.

fastpath_ok
$fsdb->fastpath_ok();

Check if we can do fast-path IO (post-header, no pending unread rows, no errors).

fastpath_sub
$sub = $fsdb->fastpath_sub();
$row_aref = &$sub();

Returns an anonymous sub that does a fast-path read when called. This code stub returns a new $aref corresponding to a data line, and handles comments as specified by -comment_handler.
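The fast-path sub is typically fetched once and then called in a tight loop. The sketch below assumes that pattern; max_of_column is a hypothetical helper, and col_to_i is assumed to be the column-name-to-index lookup provided by Fsdb::IO, which returns undef for an unknown column.

```perl
use strict;
use warnings;

# Hypothetical helper: find the largest value in one column via the fast path.
# $in is assumed to be an Fsdb::IO::Reader whose header has been read;
# the column is assumed to hold numeric values.
sub max_of_column {
    my ($in, $colname) = @_;
    my $i = $in->col_to_i($colname);
    # Index 0 is a valid column, so test definedness rather than truth.
    die "no such column: $colname\n" unless defined $i;
    my $read_sub = $in->fastpath_sub();
    my $max;
    while (my $row_aref = $read_sub->()) {
        my $v = $row_aref->[$i];
        $max = $v if !defined($max) || $v > $max;
    }
    return $max;
}
```

Resolving the column index once, outside the loop, keeps the per-row work to a single array access.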
