dspam_train

NAME

dspam_train - train a corpus of mail

SYNOPSIS

dspam_train [ username ] [ spam_dir ] [ nonspam_dir ]

DESCRIPTION

dspam_train trains and tests dspam against a corpus of mail (in maildir format). The tool presents each message to dspam for classification and retrains only if the message was classified incorrectly. This approximates real-world training and should be used to build pretrained databases. Upon execution, the tool automatically determines the ratio of spam to nonspam and trains in that ratio, so that the two corpora are worked through together rather than one after the other. The tool can also be used as a test jig to measure how efficiently and accurately dspam, in a given configuration, handles a particular corpus.
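The procedure described above (interleave the corpora by their ratio, classify each message, retrain only on a mistake) can be sketched in Python. This is an illustration under stated assumptions, not the tool's actual implementation: the names interleave_by_ratio, train_on_error, classify and retrain are hypothetical, and classify and retrain merely stand in for whatever invokes dspam.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Labeled = Tuple[str, str]  # (label, message), label is "spam" or "nonspam"


def interleave_by_ratio(spam: List[str], nonspam: List[str]) -> Iterator[Labeled]:
    """Yield messages from both corpora so each is consumed at the same relative pace.

    Hypothetical helper: at every step it takes a message from the corpus
    that has so far had the smaller fraction of its messages consumed.
    """
    i = j = 0
    while i < len(spam) or j < len(nonspam):
        spam_is_behind = i < len(spam) and i * len(nonspam) <= j * len(spam)
        if j >= len(nonspam) or spam_is_behind:
            yield ("spam", spam[i])
            i += 1
        else:
            yield ("nonspam", nonspam[j])
            j += 1


def train_on_error(
    stream: Iterable[Labeled],
    classify: Callable[[str], str],        # assumed stand-in for a dspam classification
    retrain: Callable[[str, str], None],   # assumed stand-in for a dspam retraining call
) -> int:
    """Classify each message and retrain only when the result is wrong.

    Returns the number of misclassified (and therefore retrained) messages.
    """
    errors = 0
    for label, message in stream:
        if classify(message) != label:
            retrain(message, label)
            errors += 1
    return errors
```

The error count returned by train_on_error is the kind of figure a test-jig run would report: a second pass over the same corpus should produce fewer errors than the first.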

OPTIONS

[username]

Specifies the user to train.

[spam_dir]

Specifies the pathname of the directory containing the corpus of spam. Each message should be in its own file.

[nonspam_dir]

Specifies the pathname of the directory containing the corpus of nonspam. Each message should be in its own file.
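As a sketch of how the three arguments fit together, the following Python helper assembles the argument list for dspam_train after checking that each corpus directory exists and holds at least one message file. The helper names count_messages and build_command are hypothetical, and counting only the regular files directly inside each directory is an assumption based on the one-message-per-file layout described above.

```python
import os
from typing import List


def count_messages(corpus_dir: str) -> int:
    """Count regular files directly inside corpus_dir (assumed: one message per file)."""
    with os.scandir(corpus_dir) as entries:
        return sum(1 for entry in entries if entry.is_file())


def build_command(username: str, spam_dir: str, nonspam_dir: str) -> List[str]:
    """Return the argv list for dspam_train, in the order given in the SYNOPSIS."""
    for corpus_dir in (spam_dir, nonspam_dir):
        if not os.path.isdir(corpus_dir):
            raise NotADirectoryError(corpus_dir)
        if count_messages(corpus_dir) == 0:
            raise ValueError(f"no message files found in {corpus_dir}")
    return ["dspam_train", username, spam_dir, nonspam_dir]
```

Checking the directories first keeps an empty or mistyped corpus path from being passed to the tool unnoticed.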

EXIT VALUE

0

Operation was successful.

other

Operation resulted in an error.
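A calling script can rely on this convention to decide whether a training run succeeded. The minimal Python sketch below treats exit status 0 as success and anything else as failure; the function name run_training is hypothetical, and the command to run is supplied by the caller.

```python
import subprocess
from typing import Sequence


def run_training(argv: Sequence[str]) -> bool:
    """Run the given command and return True only if it exits with status 0."""
    completed = subprocess.run(argv)
    return completed.returncode == 0
```

For example, run_training(["dspam_train", "testuser", "/path/to/spam", "/path/to/nonspam"]) would return False for any non-zero exit status; the user name and paths here are placeholders.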

AUTHORS

Jonathan A. Zdziarski

For more information, see http://www.nuclearelephant.com.

SEE ALSO

dspam(1), dspam_stats(1), dspam_clean(1), dspam_dump(1), dspam_merge(1)
