readers Package

Module containing a collection of data readers for brain data.

htkcollection Module

Module used for reading of collections of HTK files of raw or processed neural recordings.

class brain.readers.htkcollection.HTKCollection(directory, prefix=None, layout=None, anatomy_file=None, bands_file=None, guess_bands=False, check_consistency=False)

Bases: object

Class for management of a directory of HTK files from raw or processed neural recordings. All HTK files are expected to have the same size.

Variables:
  • directory – Directory where the raw HTK data files are located
  • htk_files – Python list of strings of the paths to all HTK files
  • channel_to_file_map – 2D numpy array of shape (#blocks, #channels) indicating the index of the file associated with the corresponding channel.
  • file_to_channel_map – List of two-valued tuples indicating for each file the block and channel they are associated with. blockindex=self.file_to_channel_map[i][0].
  • data – 2D numpy array with the full data from all channels, or None if read_data() has not been called.
  • layout – Numpy array describing the physical layout of the grid. By default a rectangular layout is assumed with channels starting at the bottom right of the grid and channel numbers growing from bottom to top.
  • num_samples – Number of samples per channel
  • sample_period – Sample period in 100ns units
  • sample_rate – Sampling rate in kHz. This is the same as 10000/sample_period.
  • sample_size – Number of bytes per sample
  • parameter_kind – Code indicating the sample kind (see HTKFormat for details on parmKind)
  • anatomy – Dictionary describing for different regions of the brain the electrodes that are located in the given region.
  • dtype – Numpy dtype of the HTK data
  • bands – 1D numpy array with center of the frequency bands
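The channel_to_file_map and file_to_channel_map variables describe the same association in opposite directions. The following is a minimal sketch of that relationship, using plain Python lists instead of numpy arrays and a made-up collection of 2 blocks with 3 channels each. The helper name build_channel_to_file_map and the zero-based indices are assumptions for illustration, not part of the module.

```python
def build_channel_to_file_map(file_to_channel_map, num_blocks, num_channels):
    """Invert a list of (block, channel) tuples into a (#blocks, #channels) table."""
    # -1 marks a (block, channel) slot for which no file was found.
    table = [[-1] * num_channels for _ in range(num_blocks)]
    for file_index, (block, channel) in enumerate(file_to_channel_map):
        table[block][channel] = file_index
    return table


# Hypothetical collection: file i belongs to block i // 3, channel i % 3.
file_to_channel_map = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
channel_to_file_map = build_channel_to_file_map(file_to_channel_map, 2, 3)

block_index = file_to_channel_map[4][0]    # block of file 4
channel_index = file_to_channel_map[4][1]  # channel of file 4 within its block
```

Looking up channel_to_file_map[block_index][channel_index] leads back to file index 4.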
_HTKCollection__check_consistency()

Internal helper function used to check that all HTK files in the collection have the same structure (i.e., whether the header information is the same for all files). NOTE! This function assumes that the list of htk_files has already been computed.

_HTKCollection__get_bands(bands_file=None, guess_bands=False)

Try to construct the bands from the filename. Note, this assumes that the metadata has already been constructed.

Parameters:
  • bands_file – Matlab file with the frequency bands. If not present, then the function will try to construct the frequency bands based on the filename.
  • guess_bands – If no bands file is given, whether to guess the bands from the filename.
static _HTKCollection__get_block_channel_index_from_name(filename)

Internal helper function used to determine the block index and channel index based on the name of the file.

Parameters: filename – Name of the HTK file
Returns: Integer of the block index and integer of the channel index within the block
_HTKCollection__get_htk_files()

Internal helper function used to compute the list of files (stored in self.htk_files) and the map of files to channels/blocks (stored in self.channel_block_map).

Returns: i) a list of HTK filenames, ii) a 2D numpy array of shape (#blocks, #channels) indicating the index of the file associated with a given channel, and iii) a list of tuples indicating for each file the block and channel index.
Raises: ValueError if HTK files of varying sizes are found.
_HTKCollection__get_htk_metadata()

Internal helper function used to retrieve the sampling rate, number of samples, sample size, and parameter kind. NOTE! This function assumes that the list of htk_files has already been computed.

_HTKCollection__get_layout()

Internal helper function used to define the default layout of the brain grid. Note! This function assumes that the list of htk_files has already been computed.

static _HTKCollection__read_anatomy(anatomy_file)

Read .mat file describing the anatomy of the data and return a dict describing for different brain regions (keys) the set of electrodes that are located in that region (values, stored as numpy arrays).

Parameters: anatomy_file – The name of the .mat file with the description of the anatomy
static _HTKCollection__sort_files(filelist, blockindex, channelindex, numchannels)

Based on the blockindex and channelindex of the files, compute the linear order in which the files should be sorted.

Parameters:
  • filelist – List of all files
  • blockindex – Numpy array with the block index for each file.
  • channelindex – Numpy array with the channel index within each block for each file. channelindex must be the same length as blockindex
  • numchannels – Number of channels per block (usually channelindex.max())
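One way to derive such a linear order is sketched below. It assumes the sort key is blockindex * numchannels + channelindex, so that all channels of one block precede those of the next. The documentation does not state the key, so this is an assumption. The file names are hypothetical, channel indices are taken to be one-based, and plain lists stand in for the numpy arrays.

```python
def sort_files(filelist, blockindex, channelindex, numchannels):
    """Return filelist ordered by block first, then by channel within the block."""
    # Assumed linear key: all channels of block b precede those of block b + 1.
    keys = [b * numchannels + c for b, c in zip(blockindex, channelindex)]
    order = sorted(range(len(filelist)), key=lambda i: keys[i])
    return [filelist[i] for i in order]


# Hypothetical file names; block and channel indices are given explicitly.
files = ["b1c2.htk", "b0c1.htk", "b1c1.htk", "b0c2.htk"]
blocks = [1, 0, 1, 0]
channels = [2, 1, 1, 2]  # one-based, so numchannels == max(channels)

sorted_files = sort_files(files, blocks, channels, numchannels=max(channels))
```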
__init__(directory, prefix=None, layout=None, anatomy_file=None, bands_file=None, guess_bands=False, check_consistency=False)

Initialize an object for managing a directory of raw neural recordings in HTK format.

Parameters:
  • directory – Directory with the raw HTK files
  • prefix – Optional prefix value valid HTK files must have.
  • layout – Array defining the layout of the electrodes. Set to None to use the default layout of m x m computed using the __get_layout function. E.g., the default layout for 16 electrodes is a 4x4 grid array with origin being located in the bottom right corner: ([[15, 11, 7, 3], [14, 10, 6, 2], [13, 9, 5, 1], [12, 8, 4, 0]])
  • anatomy_file – Optional file describing the anatomy of the electrodes (.mat file)
  • bands_file (String indicating the name of the .mat Matlab file) – Optional file describing the centers of the frequency bands in the neural recordings.
  • guess_bands (Boolean) – If no bands file is given, whether to guess the bands from the filename.
  • check_consistency (Boolean) – Check that all HTK files in the collection have the same structure.
Raises: AssertionError if check_consistency is enabled and inconsistencies in metadata are found between HTK files in the collection.
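The default layout described for the layout parameter can be reproduced with a few lines of Python. This is a sketch of the numbering rule only (channel 0 in the bottom right corner, numbers growing from bottom to top, then column by column to the left). The function name default_layout is hypothetical, and nested lists are used instead of a numpy array.

```python
def default_layout(num_electrodes):
    """Build an m x m grid with channel 0 at the bottom right, growing upwards."""
    m = int(round(num_electrodes ** 0.5))
    if m * m != num_electrodes:
        raise ValueError("this sketch only handles square grids")
    # Column `col` (counted from the left) holds channels starting at
    # (m - 1 - col) * m, with the lowest channel number in the bottom row.
    return [[(m - 1 - col) * m + (m - 1 - row) for col in range(m)]
            for row in range(m)]


layout = default_layout(16)
# layout == [[15, 11, 7, 3], [14, 10, 6, 2], [13, 9, 5, 1], [12, 8, 4, 0]]
```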

__module__ = 'brain.readers.htkcollection'
__weakref__

list of weak references to the object (if defined)

clear_data()

Clear the self.data instance variable to free up memory.

get_anatomy_dict()

Get the anatomy dictionary describing for each region the list of electrodes in the region.

get_anatomy_map()

Get a numpy array of strings indicating for each electrode the name of the region it is located in. ‘unknown’ is used for electrodes with an unknown region assignment.
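The relationship between the anatomy dictionary (region to electrodes) and the anatomy map (electrode to region) can be sketched as follows. The region names, the zero-based electrode indices, and the helper name anatomy_map are assumptions made for illustration. The actual method returns a numpy array rather than a list.

```python
def anatomy_map(anatomy, num_electrodes):
    """Return one region name per electrode, 'unknown' where none is assigned."""
    regions = ["unknown"] * num_electrodes
    for region, electrodes in anatomy.items():
        for electrode in electrodes:
            regions[electrode] = region
    return regions


# Hypothetical anatomy dictionary with zero-based electrode indices.
anatomy = {"region_a": [0, 1], "region_b": [3]}
names = anatomy_map(anatomy, 5)
```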

get_block_index(fileindex)

Get the block index for the file with the given index.

Parameters: fileindex – Index of the file of interest
Returns: Integer indicating the block index for the file.
get_channel_index(fileindex)

Get the channel index within a block for the file with the given index.

Parameters: fileindex – Index of the file of interest
Returns: Integer indicating the channel index for the file.
get_number_of_blocks()

Get the number of blocks into which all channels are organized.

get_number_of_channels_per_block()

Get the number of channels per block.

get_number_of_files()

Get the number of HTK files associated with the current collection of raw data.

Returns: Integer indicating the number of HTK files (len(self.htk_files)).
has_anatomy()

Check whether anatomy data is available for the collection.

read_channel(fileindex)

Get the data for the file with the given index.

read_data(print_status=False)

Read all data from file and return the numpy array. This function modifies self.data to store the data retrieved.

Parameters: print_status – Print status messages on read progress to the screen. Default is False.

htkfile Module

Module used for reading of htk files.

class brain.readers.htkfile.HTKFile(filename)

Bases: object

Class used for reading HTK format files.

Instance Variables:

Variables:
  • filename – Name of the HTK file
  • data – Numpy array of the data, or None if read_data() has not been called yet
  • num_samples – Number of samples in the file
  • sample_period – Sample period in 100ns units
  • sample_rate – Sampling rate in kHz. This is the same as 10000/sample_period.
  • sample_size – Number of bytes per sample
  • parameter_kind – Code indicating the sample kind (see HTKFormat for details on parmKind)
  • dtype – Data type
  • vector_length – Vector length
  • A – Compression parameter
  • B – Compression parameter
  • header_length – Total header length

Internal Variables:

Variables:
  • __file – The handle to the HTK file
  • __current_pos – Internal variable used to store the current sample position during iteration
static _HTKFile__is_big_endian()

Simple helper function used to determine whether the system is little-endian or big-endian. This is needed to determine whether the binary data read needs to be byte-swapped.

Returns: Boolean indicating whether the system is big-endian.
_HTKFile__seek_sample(sample_index=0)

Internal helper function used to position the file handle at the position of the sample with the given index.
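Positioning the file handle at a sample amounts to a byte-offset computation. The sketch below assumes fixed-size, uncompressed samples stored directly after the 12-byte header, so that the offset is header_length + sample_index * sample_size. The function name sample_offset and the in-memory test data are made up for illustration.

```python
import io
import struct

HEADER_LENGTH = 12  # bytes, as given by HTKFormat.header_length


def sample_offset(sample_index, sample_size, header_length=HEADER_LENGTH):
    """Byte offset of a sample, assuming fixed-size, uncompressed samples."""
    return header_length + sample_index * sample_size


# Build a tiny in-memory "file": 3 samples, each a vector of two
# big-endian float32 values.
samples = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
sample_size = 8  # 2 values * 4 bytes
header = struct.pack(">IIHH", len(samples), 1000, sample_size, 9)  # 9 = USER
body = b"".join(struct.pack(">2f", *s) for s in samples)
handle = io.BytesIO(header + body)

# Seek to sample 1 and read exactly one sample vector.
handle.seek(sample_offset(1, sample_size))
second_sample = struct.unpack(">2f", handle.read(sample_size))
```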

_HTKFile__swap_required()

Method used to check whether we need to swap the byte order of the binary data read.

Returns: Boolean indicating if the byte order needs to be swapped.
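The two endianness helpers can be approximated with the standard library. This sketch assumes that a swap is needed exactly when the native byte order differs from the big-endian order in which HTK data is written (HTKFormat.byte_order is '>'). The function bodies are an illustration and may differ from the module's implementation.

```python
import struct
import sys

HTK_BYTE_ORDER = ">"  # big-endian, as given by HTKFormat.byte_order


def is_big_endian():
    """True if the machine running this code uses big-endian byte order."""
    return sys.byteorder == "big"


def swap_required():
    """True if natively read values must be byte-swapped to match HTK data."""
    return not is_big_endian()


# Reading with an explicit byte order in the format string avoids manual swaps.
raw = b"\x00\x00\x01\x00"
value = struct.unpack(HTK_BYTE_ORDER + "I", raw)[0]  # 0x00000100 == 256
```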
__init__(filename)
__iter__()

Make the HTKFile iterable

__module__ = 'brain.readers.htkfile'
__weakref__

list of weak references to the object (if defined)

next()

Get the next item for iteration

read_data()

Get a numpy data array of all the samples

Returns:Numpy data array of all the samples
read_sample(sample_index)

Read the data of a single sample with the given index.

Parameters: sample_index – The index of the sample to be read
Returns: The vector with the data for the sample.
class brain.readers.htkfile.HTKFormat

Bases: object

Specification of base information about the HTK file format.

__module__ = 'brain.readers.htkfile'
__weakref__

list of weak references to the object (if defined)

byte_order = '>'

Byte-order in which the HTK data is written

header = [{'description': 'Number of samples in the File', 'name': 'num_samples', 'format': 'I'}, {'description': 'Sample period in 100ns units', 'name': 'sample_period', 'format': 'I'}, {'description': 'Number of bytes per sample', 'name': 'sample_size', 'format': 'H'}, {'description': 'A code indicating the sample kind', 'name': 'parameter_kind', 'format': 'H'}]

List describing the contents of the HTK file header.

classmethod header_format()

Get the format string to unpack the header of the HTK file.

Returns: String (e.g., ‘>IIHH’) describing the format to be used for unpacking the header.

header_length = 12

Total length in bytes of the file header.
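Taken together, byte_order, header, and header_length describe how to unpack the first 12 bytes of a file. The following sketch uses Python's struct module with the '>IIHH' format string and derives the sampling rate in kHz from the sample period. The function name parse_header and the header values are made up for illustration.

```python
import struct

HEADER_FORMAT = ">IIHH"  # byte order '>' plus the four header fields
HEADER_FIELDS = ("num_samples", "sample_period", "sample_size", "parameter_kind")


def parse_header(raw):
    """Unpack the 12-byte HTK header into a dictionary."""
    size = struct.calcsize(HEADER_FORMAT)
    header = dict(zip(HEADER_FIELDS, struct.unpack(HEADER_FORMAT, raw[:size])))
    # sample_period is in 100 ns units, so 10000 / sample_period is in kHz.
    header["sample_rate"] = 10000.0 / header["sample_period"]
    return header


# Made-up header: 4000 samples, a period of 2500 * 100 ns, 4-byte samples,
# parameter kind 0 (WAVEFORM).
raw_header = struct.pack(HEADER_FORMAT, 4000, 2500, 4, 0)
info = parse_header(raw_header)
```

A period of 2500 units is 250 microseconds, which corresponds to a sampling rate of 4 kHz.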

param_kind_base = {'MELSPEC': 8, 'IREFC': 5, 'USER': 9, 'FBANK': 7, 'LPCDELCEP': 4, 'LPC': 1, 'LPCEPSTRA': 3, 'DISCRETE': 10, 'MFCC': 6, 'WAVEFORM': 0, 'LPCREFC': 2}

Dictionary describing the basic parameter kind codes.

  • WAVEFORM = 0 : sampled waveform
  • LPC = 1 : linear prediction filter coefficients
  • LPCREFC = 2 : linear prediction reflection coefficients
  • LPCEPSTRA = 3 : LPC cepstral coefficients
  • LPCDELCEP = 4 : LPC cepstra plus delta coefficients
  • IREFC = 5 : LPC reflection coefficient in 16 bit integer format
  • MFCC = 6 : mel-frequency cepstral coefficients
  • FBANK = 7 : log mel-filter bank channel outputs
  • MELSPEC = 8 : linear mel-filter bank channel outputs
  • USER = 9 : user-defined sample kind
  • DISCRETE = 10 : vector quantised data
param_kind_encoding = {'_O': 8192, '_N': 128, '_K': 4096, '_Z': 2048, '_E': 64, '_D': 256, '_C': 1024, '_A': 512}

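Dictionary describing the parameter kind encodings. The values in the list below are written in octal (e.g., _E = 0000100 octal = 64 decimal), while the dictionary above shows the same values in decimal.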

  • _E = 0000100 : has energy
  • _N = 0000200 : absolute energy suppressed
  • _D = 0000400 : has delta coefficients
  • _A = 0001000 : has acceleration (delta-delta) coefficients
  • _C = 0002000 : is compressed
  • _Z = 0004000 : has zero mean static coefficients
  • _K = 0010000 : has CRC checksum
  • _O = 0020000 : has 0th cepstral coefficient
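A parameter_kind code combines one base kind with any number of qualifier flags. The sketch below splits such a code into its parts. It assumes that the base kind occupies the low six bits of the code (mask 077 octal), which is not stated in this documentation. The function name decode_parameter_kind is hypothetical.

```python
PARAM_KIND_BASE = {
    "WAVEFORM": 0, "LPC": 1, "LPCREFC": 2, "LPCEPSTRA": 3, "LPCDELCEP": 4,
    "IREFC": 5, "MFCC": 6, "FBANK": 7, "MELSPEC": 8, "USER": 9, "DISCRETE": 10,
}
PARAM_KIND_ENCODING = {
    "_E": 0o100, "_N": 0o200, "_D": 0o400, "_A": 0o1000,
    "_C": 0o2000, "_Z": 0o4000, "_K": 0o10000, "_O": 0o20000,
}
BASE_MASK = 0o77  # assumption: the base kind occupies the low six bits
BASE_NAMES = {value: name for name, value in PARAM_KIND_BASE.items()}


def decode_parameter_kind(code):
    """Split a parameter kind code into (base name, list of qualifier names)."""
    base = BASE_NAMES.get(code & BASE_MASK, "UNKNOWN")
    # Report qualifiers in order of increasing bit value.
    by_bit = sorted(PARAM_KIND_ENCODING.items(), key=lambda item: item[1])
    qualifiers = [name for name, bit in by_bit if code & bit]
    return base, qualifiers


# MFCC with energy and delta coefficients.
kind = decode_parameter_kind(6 | 0o100 | 0o400)
```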
