CSTOCS

NAME

cstocs −− charset encoding convertor for the Czech and Slovak languages.

FORMAT

        cstocs [options] src_encoding dst_encoding [files ...]

SYNOPSIS

        cstocs il2 ascii < file | less
        cstocs −i utf8 il2 file1 file2 file3
        cstocs −−help

DESCRIPTION

Cstocs is a simple conversion utility that changes the charset encoding of a text. It reads either the specified files or (if none are specified) the standard input, assumes that the input is encoded in "src_encoding", and tries to reencode it into "dst_encoding". The result is written to the standard output.

Run "cstocs" without parameters to get a short help text and a list of available encodings.

Characters that are not defined in "src_encoding" are passed to the output unchanged.

If the source text contains a character that is defined in "src_encoding" but not in "dst_encoding", it can be handled in several ways. For example, the characters "e with caron" (symbol ecaron) and "d with caron" (symbol dcaron) are included in the iso−8859−2 encoding but not in iso−8859−1. When reencoding 8859−2 text to 8859−1, you may want to do one of the following:

1. Keep the character unchanged: option "−−nofillstring".

2. Produce no output in place of the character: option "−−null".

3. Substitute a string (possibly a space) for both ecaron and dcaron: option "−−fillstring".

4. Substitute the letter "d" for dcaron and "e" for ecaron. It is even possible to substitute a string for a symbol, so you can replace the "Æ" Latin character with the string "AE" (letter "A" and letter "E"), or a "plusminus sign" with the string "+/−". These substitutions are described in the accent file.
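The four strategies above can be modelled in a few lines of Python. This is only a toy sketch, not how cstocs is implemented: the character sets and the ACCENT table are hypothetical stand-ins for the real encoding and accent files, and convert() is an illustrative name. It assumes, as described under OPTIONS, that accent-file rules are tried before the fillstring.

```python
# Toy model only: these tables are hypothetical stand-ins for the real
# encoding files and accent file shipped with cstocs.
ECARON, DCARON, AE, PLUSMINUS = "\u011b", "\u010f", "\u00c6", "\u00b1"

SRC_CHARS = {ECARON, DCARON, AE, PLUSMINUS, "a", "d", "e"}  # defined in src_encoding
DST_CHARS = {"a", "d", "e", "A", "E", "+", "/", "-"}        # defined in dst_encoding
ACCENT = {ECARON: "e", DCARON: "d", AE: "AE", PLUSMINUS: "+/-"}


def convert(text, use_accent=True, fillstring=" "):
    """Model the fallback handling.

    fillstring=None models --nofillstring, fillstring="" models --null,
    use_accent=False models --noaccent.
    """
    out = []
    for ch in text:
        if ch not in SRC_CHARS or ch in DST_CHARS:
            out.append(ch)              # not defined in source, or representable
        elif use_accent and ch in ACCENT:
            out.append(ACCENT[ch])      # strategy 4: accent-file substitution
        elif fillstring is None:
            out.append(ch)              # strategy 1: keep unchanged
        else:
            out.append(fillstring)      # strategies 2 and 3: "" or a filler
    return "".join(out)
```

With the accent table enabled, convert("d\u011bd") yields "ded"; with use_accent=False the same call yields "d d", because the default fillstring is a single space.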

OPTIONS

−i, −i.ext, −−inplace.ext

The specified files are converted in place, using Perl's "−i" facility. Optionally, an extension for backup copies may be specified after the dot. If given, this option must be the first one.
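To illustrate what in-place conversion with a backup extension amounts to, here is a rough Python sketch. It is an assumption-laden model, not the Perl mechanism itself: convert_in_place() and its parameters are hypothetical names, and the conversion step is passed in as an arbitrary bytes-to-bytes function.

```python
import shutil


def convert_in_place(path, convert, backup_ext=None):
    """Rewrite the file at path with convert(contents).

    If backup_ext is given (for example ".bak"), the original file is first
    copied to path + backup_ext, similar in spirit to an -i.bak style option.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    if backup_ext:
        shutil.copy2(path, path + backup_ext)   # keep the untouched original
    with open(path, "wb") as fh:
        fh.write(convert(data))
```

Without a backup extension the original contents are lost once the file is rewritten, which is why specifying one is usually the safer choice.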

−−dir directory

Encoding files are taken from directory instead of the default, which is Cz/Cstocs/enc in the Perl lib tree. The location of encoding files can also be changed using the CSTOCSDIR environment variable, but the −−dir option has the highest priority.
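The lookup order described above (the −−dir option first, then CSTOCSDIR, then the built-in default) can be sketched as follows. The function name resolve_enc_dir() is hypothetical, and DEFAULT_ENC_DIR is only a placeholder, since the absolute location of Cz/Cstocs/enc depends on the Perl installation.

```python
import os

# Placeholder: the real default is Cz/Cstocs/enc inside the Perl lib tree,
# whose absolute path depends on the installation.
DEFAULT_ENC_DIR = "Cz/Cstocs/enc"


def resolve_enc_dir(dir_option=None, environ=None):
    """Return the encoding directory: --dir, then CSTOCSDIR, then the default."""
    env = os.environ if environ is None else environ
    if dir_option:
        return dir_option               # the command-line option wins
    if env.get("CSTOCSDIR"):
        return env["CSTOCSDIR"]         # environment variable comes second
    return DEFAULT_ENC_DIR
```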

−−fillstring string

If the source text contains a character that is defined in "src_encoding" but neither in "dst_encoding" nor in the accent file (or the accent file is not used), it is replaced by "string". The default is a single space.

−−nofillstring

Leave unchanged any character that would otherwise be replaced by the fillstring. This differs from "−−null", which removes such characters from the output.

−−null

Completely equivalent to −−fillstring "".

−−nochange or −−noaccent

Do not use the accent file at all.

−−onebyone

Use only those rules from the accent file that rewrite one character to one character. With this option, the character "ecaron" is rewritten to "e", but the "Æ" character is not rewritten to the string "AE".

−−onebymore

Use all rules from the accent file. This is the default option.
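The difference between the two modes can be pictured as a filter over the accent rules. The sketch below is illustrative only: select_rules() is a hypothetical helper, and the rules are assumed to be given as a simple source-to-replacement mapping rather than in the real accent file format.

```python
def select_rules(accent_rules, onebyone=False):
    """Return the accent rules that apply in the chosen mode.

    onebyone=False models --onebymore (all rules, the default);
    onebyone=True models --onebyone (only one-character-to-one-character rules).
    """
    if not onebyone:
        return dict(accent_rules)
    return {
        src: dst
        for src, dst in accent_rules.items()
        if len(src) == 1 and len(dst) == 1
    }
```

Given the rules {"\u011b": "e", "\u00c6": "AE"}, the onebyone mode keeps only the first one, so "Æ" would then fall through to the fillstring handling.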

SEE ALSO

Cz::Cstocs(3).

AUTHOR

Jan "Yenya" Kasprzak wrote the original Un*x implementation.

Jan Pazdziora, adelton AT fi DOT muni DOT cz, created the Perl module version.
