4.1.1. make_pairs.py

Make the pairs of genes for a species.

For each pair, fetch the Hi-C value for the two bins in which the transcription start sites (TSS) of the genes are located, and check whether the two genes are adjacent. Optionally, the Hi-C matrices can be scrambled in memory before use; this feature relies on two functions written by Krister SWENSON for the locality program.
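The bin lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not the script's actual code: the bin resolution, the dictionary-based sparse layout, and the names `tss_bin` and `hic_value` are hypothetical. Returning 0.0 for an entry missing from an existing matrix is also an assumption; only the use of Not-a-Number when no matrix exists for a chromosome pair follows the option descriptions in this section.

```python
import math

RESOLUTION = 10_000  # assumed bin size in base pairs (hypothetical)


def tss_bin(tss, resolution=RESOLUTION):
    """Return the index of the bin that contains a TSS position."""
    return tss // resolution


def hic_value(matrices, gene1, gene2, resolution=RESOLUTION):
    """Look up the Hi-C value for the bins holding the two TSS.

    `matrices` maps a (chrom_a, chrom_b) pair (chrom_a <= chrom_b) to a
    sparse matrix stored as {(bin_a, bin_b): value}; for a single
    chromosome, bin_a <= bin_b. This layout is assumed for the sketch.
    Each gene is a (chromosome, tss) tuple.
    """
    # Sorting orders the genes by chromosome, then by TSS position,
    # so the keys built below match the assumed storage order.
    (chrom1, tss1), (chrom2, tss2) = sorted([gene1, gene2])
    matrix = matrices.get((chrom1, chrom2))
    if matrix is None:
        # No values for this chromosome pair: Not-a-Number is used.
        return math.nan
    key = (tss_bin(tss1, resolution), tss_bin(tss2, resolution))
    # Assumption: an absent entry in a sparse matrix counts as 0.0.
    return matrix.get(key, 0.0)
```

With `matrices = {("chr1", "chr1"): {(0, 2): 5.0}}`, calling `hic_value(matrices, ("chr1", 25_000), ("chr1", 1_000))` reads the entry for bins 0 and 2, whatever the order in which the genes are given.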

The result is a TSV file, gzipped if the file name ends with .gz. The file has no header and contains the following columns:

  • gene 1 name (string)
  • gene 2 name (string)
  • Hi-C value (64-bit float)
  • Adjacency status (either True or False, case-insensitive)
created: May 2018
last modified: July 2018
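As an illustration of the output format described above, the following sketch reads such a file back into Python. The function names `read_pairs` and `parse_bool` are hypothetical and not part of make_pairs.py; the sketch only assumes the four tab-separated columns and the .gz convention stated above.

```python
import csv
import gzip


def parse_bool(text):
    """Parse the adjacency column, which is case-insensitive."""
    value = text.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"not a boolean: {text!r}")


def read_pairs(path):
    """Yield (gene1, gene2, hic_value, adjacent) tuples from a pairs file.

    The file is read through gzip when its name ends with .gz.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        for gene1, gene2, value, adjacent in csv.reader(handle, delimiter="\t"):
            # float() also accepts "nan", so Not-a-Number values survive.
            yield gene1, gene2, float(value), parse_bool(adjacent)
```

Because the file has no header, every line is treated as a data row.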

4.1.1.1. Usage

usage: make_pairs.py [-h] [-N] [-i] [-s] [-v] [--debug] genes hic output

4.1.1.1.1. Positional Arguments

genes    the genes, in BED format
hic      the directory containing the Hi-C sparse matrices
output   the output TSV file; gzipped if the file name ends with .gz

4.1.1.1.2. Named Arguments

-N, --no-nan

skip the pair if the Hi-C value is Not-a-Number

Default: False

-i, --intra

only consider pairs of genes on the same chromosome; if not set and there are no interchromosomal values, Not-a-Number is used

Default: False

-s, --scramble

scramble the Hi-C matrices in memory before use

Default: False

-v, --verbose

be verbose

Default: False

--debug

print debug information

Default: False
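For reference, a command-line interface matching the usage line and options listed above could be declared with argparse roughly as follows. This is a reconstruction from this page, not the script's actual source; the function name `build_parser` and the file names in the example call are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser consistent with the documented usage line."""
    parser = argparse.ArgumentParser(
        prog="make_pairs.py",
        description="Make the pairs of genes for a species.",
    )
    parser.add_argument("genes", help="the genes, in BED format")
    parser.add_argument("hic", help="the directory with the Hi-C sparse matrices")
    parser.add_argument("output", help="the output TSV file; can be gzipped")
    # All named arguments are flags that default to False.
    parser.add_argument("-N", "--no-nan", action="store_true",
                        help="skip the pair if the Hi-C value is Not-a-Number")
    parser.add_argument("-i", "--intra", action="store_true",
                        help="only consider genes on the same chromosome")
    parser.add_argument("-s", "--scramble", action="store_true",
                        help="scramble the Hi-C matrices in memory before use")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="be verbose")
    parser.add_argument("--debug", action="store_true",
                        help="print debug information")
    return parser


# Example call with hypothetical file names.
args = build_parser().parse_args(
    ["-N", "-i", "genes.bed", "hic_matrices", "pairs.tsv.gz"]
)
```

Here `args.no_nan` and `args.intra` are set, while `args.scramble`, `args.verbose`, and `args.debug` keep their default of False.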