ch_track Track file manipulation


ch_track [input file] -o [output file] [options] [-h ] [-itype string] [-ctype string] [-s float] [-c string] [-start float] [-end float] [-from int] [-to int] [-otype string {ascii}] [-S float] [-o ofile] [-info ] [-track_names string] [-diff ] [-delta int] [-sm float] [-smtype string] [-style string] [-t float] [-neg string] [-pos string] [-pc string]

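ch_track is used to manipulate the format of a track file. Operations include combining multiple tracks into a single track, extracting channels from multi-channel tracks, extracting a region of a track, adding headers, converting between file formats, and changing the frame spacing.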



-h Options help


-itype string Input file type (optional). If no type is specified, the type is derived automatically from the file's header. Supported types are: none, esps, est, est_binary, htk, htk_fbank, htk_mfcc, htk_user, htk_discrete, xmg, xgraph, ema, ema_swapped, ascii


-ctype string Contour type: F0, track


-s float Frame spacing of input in seconds, for unheadered input file


-c string Select a subset of channels (starts from 0). Tracks can have multiple channels. This option specifies a list of numbers, referring to the channel numbers which are to be used for processing.


-start float Extract track starting at this time, specified in seconds


-end float Extract track ending at this time, specified in seconds


-from int Extract track starting at this frame position


-to int Extract track ending at this frame position


-otype string {ascii} Output file type. If unspecified, ascii is assumed. Types are: none, esps, est, est_binary, htk, htk_fbank, htk_mfcc, htk_user, htk_discrete, xmg, xgraph, ema, ema_swapped, ascii, label


-S float Frame spacing of output in seconds. If this is different from the internal spacing, the contour is resampled at this spacing


-o ofile Output filename, defaults to stdout


-info Print information about file and header. This option gives useful information such as file length, file type, channel names. No output is produced


-track_names string File containing new names for output channels


-diff Differentiate contour. This performs simple numerical differentiation on the contour by subtracting the amplitude of the current frame from the amplitude of the next. Although quick, this technique is crude and not recommended, as the estimate of the derivative is based on only one point


-delta int Make delta coefficients (better form of differentiate). The argument to this option is the regression length of the delta calculation and can be between 2 and 4


-sm float Length of smoothing window in seconds. Various types of smoothing are available for tracks. This option specifies the length of the smoothing window, which affects the degree of smoothing, i.e. a longer window means more smoothing


-smtype string Smooth type, median or mean


-style string Convert track to other form. Currently only one form "label" is supported. This uses a specified cut off to make a label file, with two labels, one for above the cut off (-pos) and one for below (-neg)


-t float Threshold for track to label conversion


-neg string Name of negative label in track to label conversion


-pos string Name of positive label in track to label conversion


-pc string Combine given tracks in parallel. If the argument is longest, shorter tracks are padded to the length of the longest track; if it is first, tracks are padded or cut to match the first input track


Available track file formats:

none           unknown track file type
esps           entropic sps file
est            Edinburgh Speech Tools track file
est_binary     Edinburgh Speech Tools track file
htk            htk file
htk_fbank      htk file (as FBANK)
htk_mfcc       htk file (as MFCC)
htk_user       htk file (as USER)
htk_discrete   htk file (as DISCRETE)
xmg            xmg file viewer
xgraph         xgraph display program format
ema            ema
ema_swapped    ema, swapped
ascii          ascii decimal numbers
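The -diff, -sm/-smtype and -style label options can be pictured as simple per-channel operations on a list of frame values. The following Python sketch only illustrates the ideas described in the option list. It is not ch_track's implementation, and the function names, the handling of the last frame in the difference, the shrinking window at the track edges, and the treatment of values equal to the threshold are all assumptions made for this example.

```python
from statistics import mean, median


def simple_diff(values):
    """Simple differentiation: next frame minus current frame.

    Assumption: the result is one value shorter than the input.
    """
    return [nxt - cur for cur, nxt in zip(values, values[1:])]


def smooth(values, window_frames, kind="median"):
    """Centred sliding-window smoothing over one channel.

    window_frames is the window length in frames (for example the -sm
    value in seconds divided by the frame spacing). Assumption: near the
    ends of the track the window simply shrinks.
    """
    if kind not in ("median", "mean"):
        raise ValueError("kind must be 'median' or 'mean'")
    reduce_fn = median if kind == "median" else mean
    half = window_frames // 2
    return [
        reduce_fn(values[max(0, i - half):i + half + 1])
        for i in range(len(values))
    ]


def to_labels(values, threshold, pos="pos", neg="neg"):
    """Per-frame label decision against a cut off.

    Assumption: values equal to the threshold count as below it.
    """
    return [pos if v > threshold else neg for v in values]


if __name__ == "__main__":
    contour = [1.0, 10.0, 1.0, 1.0, 1.0]
    print(simple_diff(contour))
    print(smooth(contour, 3, "median"))
    print(to_labels(contour, 5.0, pos="high", neg="low"))
```

A longer window passed to smooth() averages over more frames, which matches the statement that a longer -sm value means more smoothing. A real label file would merge runs of identical per-frame labels into segments; the sketch stops at the per-frame decision.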

Making multiple tracks into a single track

If multiple input files are specified, by default they are concatenated into the output file.

$ ch_track -o

In the above example, 4 multi channel input files are converted to one single channel output file. Multi-channel tracks can be concatenated provided they all have the same number of input channels.
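The concatenation rule can be sketched in a few lines of Python. Here a track is modelled as a list of frames and each frame as a list of channel values; this representation and the function name concatenate are assumptions for illustration and say nothing about how ch_track stores tracks internally.

```python
def concatenate(tracks):
    """Join tracks end to end.

    Every frame in every track must have the same number of channels,
    otherwise the tracks cannot be concatenated.
    """
    widths = {len(frame) for track in tracks for frame in track}
    if len(widths) > 1:
        raise ValueError("all tracks must have the same number of channels")
    # Copy each frame so the result does not share lists with the inputs.
    return [list(frame) for track in tracks for frame in track]


if __name__ == "__main__":
    first = [[1.0, 2.0]]
    second = [[3.0, 4.0], [5.0, 6.0]]
    print(concatenate([first, second]))
```

The output has as many frames as all inputs together and the same number of channels as each input.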

Multiple input files can be made into a multi-channel output file by using the -pc option:

$ ch_track -o -pc longest

The argument to -pc can be either longest, in which case the output track is the length of the longest input file, or first, in which case it is the length of the first input file.
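The difference between longest and first can be illustrated with a small Python sketch. Tracks are again modelled as lists of frames, each frame a list of channel values. The function name combine_parallel and the padding value of 0.0 are assumptions for this example; the value ch_track actually uses for padding is not stated here.

```python
def combine_parallel(tracks, mode="longest", pad=0.0):
    """Place the channels of several tracks side by side.

    mode="longest": output has the length of the longest input, shorter
    tracks are padded. mode="first": output has the length of the first
    input, other tracks are padded or cut to match.
    """
    if mode == "longest":
        n_frames = max(len(track) for track in tracks)
    elif mode == "first":
        n_frames = len(tracks[0])
    else:
        raise ValueError("mode must be 'longest' or 'first'")

    combined = []
    for i in range(n_frames):
        frame = []
        for track in tracks:
            width = len(track[0]) if track else 0
            # Use the real frame if it exists, otherwise a padded one.
            frame.extend(track[i] if i < len(track) else [pad] * width)
        combined.append(frame)
    return combined


if __name__ == "__main__":
    long_track = [[1.0], [2.0], [3.0]]
    short_track = [[9.0]]
    print(combine_parallel([long_track, short_track], "longest"))
    print(combine_parallel([short_track, long_track], "first"))
```

The number of output channels is the sum of the channel counts of the inputs, which is what distinguishes -pc from plain concatenation.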

Extracting channels from multi-channel tracks

The -c option is used to specify channels which should be extracted from the input. If the input is a 4 channel track,

$ ch_track -o -c "0 2"

will extract the 0th and 2nd channel (counting starts from 0). The argument to -c can be either a single number or a list of numbers (wrapped in quotes).
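As a sketch of what channel selection means, the following Python function takes the same kind of argument string as -c and picks the named channels out of every frame. The function name select_channels and the list-of-frames representation are assumptions for illustration.

```python
def select_channels(track, spec):
    """Keep only the channels listed in spec, e.g. "0 2" or "1".

    Channel numbers start from 0, and the output keeps the channels in
    the order in which they are listed.
    """
    indices = [int(token) for token in spec.split()]
    return [[frame[i] for i in indices] for frame in track]


if __name__ == "__main__":
    four_channel = [[10, 11, 12, 13], [20, 21, 22, 23]]
    print(select_channels(four_channel, "0 2"))
```

The number of frames is unchanged; only the number of channels per frame is reduced.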

Extracting a single region from a track

There are several ways of extracting a region of a track. The simplest way is to use the -start, -end, -from and -to options to delimit a portion of the input track. For example

$ ch_track -o -start 1.45 -end 1.768

extracts a subtrack starting at 1.45 seconds and extending to 1.768 seconds. Alternatively,

$ ch_track -o -from 50 -to 100

extracts a subtrack starting at frame 50 and extending to frame 100. Times and frames can be mixed in sub-track extraction. The output track will have the same number of channels as the input track.
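The relation between time limits and frame limits can be sketched as follows for a fixed frame spacing. This is only an illustration under stated assumptions: frame i is taken to lie at time i times the frame spacing, both limits are treated as inclusive, and the function and parameter names are invented for the example. ch_track's own conventions may differ.

```python
import math


def extract_region(track, shift, start=None, end=None,
                   from_frame=None, to_frame=None):
    """Return the frames between two limits, each given as a time in
    seconds or as a frame position. Times and frames can be mixed.

    Assumptions: frame i sits at time i * shift, and both limits are
    inclusive.
    """
    first, last = 0, len(track) - 1
    if start is not None:
        # First frame at or after the start time.
        first = math.ceil(start / shift - 1e-9)
    if from_frame is not None:
        first = from_frame
    if end is not None:
        # Last frame at or before the end time.
        last = math.floor(end / shift + 1e-9)
    if to_frame is not None:
        last = to_frame
    return track[first:last + 1]


if __name__ == "__main__":
    track = [[float(i)] for i in range(10)]  # one channel, ten frames
    print(extract_region(track, 0.5, start=1.0, end=2.0))
    print(extract_region(track, 0.5, start=1.0, to_frame=3))
```

Each returned frame is passed through unchanged, so the output has the same number of channels as the input.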

Adding headers and format conversion

It is usually a good idea for all track files to have headers as this way different files can be handled safely. ch_track provides a means of adding headers to unheadered files. These files are assumed to be ascii floats with one channel per line. The following adds a header to an ascii file.

$ ch_track kdt_010.atr -o -otype est -s 0.01

ch_track can change the frame shift of a fixed frame shift file, or convert a variable frame shift file into a fixed frame shift file. At present this is done with a very crude resampling technique and hence the output file may suffer from aliasing distortion.

Change to a frame spacing of 0.02 seconds:

$ ch_track -o kdt_010.tr2 -S 0.02
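To give a feel for what a crude resampling to a fixed frame spacing can look like, the Python sketch below picks, for every new frame time, the value of the nearest input frame. This nearest-frame scheme is an assumption chosen for illustration; the exact technique used by ch_track is not described here. The function name and the choice of placing output frames at multiples of the new spacing are also assumptions.

```python
def resample_nearest(times, values, new_shift):
    """Resample one channel to a fixed frame spacing.

    times:  ascending frame times in seconds (fixed or variable spacing)
    values: one value per input frame
    Output frames are placed at new_shift, 2 * new_shift, ... up to the
    last input time (assumption), each taking the value of the input
    frame closest in time.
    """
    n_out = int(times[-1] / new_shift + 1e-9)
    resampled = []
    for k in range(1, n_out + 1):
        t = k * new_shift
        nearest = min(range(len(times)), key=lambda i: abs(times[i] - t))
        resampled.append(values[nearest])
    return resampled


if __name__ == "__main__":
    times = [1.0, 2.0, 3.0, 4.0]
    values = [10.0, 20.0, 30.0, 40.0]
    print(resample_nearest(times, values, 2.0))
```

Because no low-pass filtering is applied before frames are dropped, rapid changes in the input can show up as distortion in the output, which is the kind of artefact the text above warns about.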