4.2.3. hic

This module contains classes used to work with Hi-C data.

class hic.HiC(dirname)

Bases: object

HiC represents a Hi-C experiment and handles reading the matrices and fetching positions.

The experiment is encapsulated into a directory, with one file per matrix. The matrices are written as gzipped TSV files with 3 columns: row position, column position and value. The matrices are sparse.
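As an illustration of the file layout described above, here is a minimal sketch that reads one gzipped, three-column TSV matrix into a sparse dictionary. It is not the module's own reader. The function name read_sparse_tsv is hypothetical, and the sketch assumes that the files have no header line, that positions are integers and that values are floats.

```python
import csv
import gzip


def read_sparse_tsv(path):
    """Read a gzipped 3-column TSV into a {(row, col): value} dict.

    Hypothetical helper: assumes no header line, integer positions
    and float values.
    """
    entries = {}
    # "rt" opens the gzip stream in text mode so that csv can parse it.
    with gzip.open(path, "rt", newline="") as handle:
        for row, col, value in csv.reader(handle, delimiter="\t"):
            entries[(int(row), int(col))] = float(value)
    return entries
```

Because the matrices are sparse, only the cells present in the file end up in the dictionary. A caller has to decide how to treat absent cells, for instance with entries.get((row, col), 0.0).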

There is also a JSON file called metadata.json of the following form:

{
  "Binsize": 5000,
  "Assembly": "droYak2",
  "Species": "Drosophila yakuba",
  "Comment": "nm_none - No NaN",
  "Date": "",
  "Dataset": "Dyak_c",
  "Dims": {
    "2R|2R": [1234, 1234],
    "4|X": [345, 2345],
    ...
  },
  "MapFiles": {
    "2R|2R": "2R_2R.tsv.gz",
    "4|X": "4_X.tsv.gz",
    ...
  }
}

Please note that the file names can follow any format, but the keys cannot: each key must be of the form chromosome|chromosome.
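The key convention can be illustrated with a short sketch that parses a metadata document and splits each key into its two chromosome names. The helper parse_map_keys is hypothetical, the inline JSON leaves out the elided entries, and deriving a chromosome list and an inter-chromosomal flag from the keys is only one plausible approach, not necessarily what the class does.

```python
import json

# Inline stand-in for metadata.json, reduced to the fields used here.
METADATA_TEXT = """
{
  "Binsize": 5000,
  "Assembly": "droYak2",
  "Dims": {"2R|2R": [1234, 1234], "4|X": [345, 2345]},
  "MapFiles": {"2R|2R": "2R_2R.tsv.gz", "4|X": "4_X.tsv.gz"}
}
"""


def parse_map_keys(metadata):
    """Return {(row_chrom, col_chrom): file_name} from "MapFiles"."""
    maps = {}
    for key, file_name in metadata["MapFiles"].items():
        # Keys are of the form chromosome|chromosome.
        row_chrom, col_chrom = key.split("|")
        maps[(row_chrom, col_chrom)] = file_name
    return maps


metadata = json.loads(METADATA_TEXT)
maps = parse_map_keys(metadata)

# Assumption: the chromosome list and the inter-chromosomal flag
# can be derived from the keys alone.
chromosomes = sorted({chrom for pair in maps for chrom in pair})
has_inter = any(row != col for row, col in maps)
```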

Created: May 2018
Last modified: August 2018

binsize = None

The binsize of the dataset.

chromosomes = None

The list of chromosomes for which data are available.

current = None

The currently loaded Hi-C map.

get_contact(g1, g2)

Fetch the contact between the genes g1 and g2. Both are dicts of the same kind as returned in the genes. If the genes are not on the same chromosomes as the currently loaded heatmap, an exception is raised. If at least one of them is out of bounds, Not-a-Number (NaN) is returned.
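The behaviour described above can be sketched as follows. This is an illustration under several stated assumptions, not the method's actual implementation: the gene dicts are given a hypothetical layout with "chrom" and "start" keys, a genomic position is mapped to a bin by integer division by the bin size, the matrix is a sparse dict keyed by bin indices, absent cells count as 0.0, and ValueError stands in for the unspecified exception.

```python
import math


def get_contact_sketch(entries, dims, binsize, row_chrom, col_chrom, g1, g2):
    """Look up the contact value for two genes in one loaded map.

    Hypothetical gene layout: {"chrom": str, "start": int}.
    """
    # The genes must lie on the chromosomes of the loaded map.
    if g1["chrom"] != row_chrom or g2["chrom"] != col_chrom:
        raise ValueError("genes do not match the loaded map")

    # Assumption: a position maps to a bin by integer division.
    row = g1["start"] // binsize
    col = g2["start"] // binsize

    # Out-of-bounds bins yield NaN, as the description states.
    n_rows, n_cols = dims
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        return math.nan

    # Assumption: cells absent from the sparse matrix count as 0.0.
    return entries.get((row, col), 0.0)
```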

inter = None

True iff the dataset contains inter-chromosomal contacts.

load_all_maps()

Load all the matrices. The result is an array with all the values.

Warning

This function may require a large amount of memory.

Warning

This function’s result may change in the future.

load_map(rowChrom, colChrom, scramble=False)

Load the requested matrix. If needed, the row and column chromosomes are swapped. If there is no matrix for this pair of chromosomes, a NoSuchHeatmap error is raised. If scramble is True, the matrix is scrambled in place after being loaded.
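The lookup and swap logic can be sketched as below. The function resolve_map is hypothetical and only shows one way to find the file for a chromosome pair in the "MapFiles" mapping. It assumes that a swap means falling back to the reversed key, and it defines a local stand-in for the NoSuchHeatmap exception so that the example is self-contained.

```python
class NoSuchHeatmap(Exception):
    """Local stand-in for hic.NoSuchHeatmap."""


def resolve_map(map_files, row_chrom, col_chrom):
    """Return (file_name, swapped) for the requested chromosome pair."""
    key = f"{row_chrom}|{col_chrom}"
    if key in map_files:
        return map_files[key], False

    # Assumption: a swap means trying the reversed key.
    swapped_key = f"{col_chrom}|{row_chrom}"
    if swapped_key in map_files:
        # The stored matrix has its rows and columns the other way round.
        return map_files[swapped_key], True

    raise NoSuchHeatmap(key)
```

When the returned flag is True, a caller would have to exchange row and column positions while reading the file so that rows correspond to the requested row chromosome.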

exception hic.NoSuchHeatmap

Bases: Exception