4.1.7. norm_center.py

This script normalizes multiple Hi-C datasets so that they can be compared with one another. Normalization proceeds in two steps, each applied to every dataset separately.

The first step consists of the following process:

  1. All matrices are read into memory.
  2. The mean and standard deviation are computed.
  3. For each box of the matrices, the mean is subtracted from the box, then the box is divided by the standard deviation.
  4. The normalized matrices are written.
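The four steps above can be sketched as follows. This is a minimal illustration rather than the script's actual code. It assumes that a dataset is available as a list of NumPy arrays and that a single mean and standard deviation are pooled over every box of every matrix in the dataset. The function name center_normalize is hypothetical, and reading and writing the matrices are left out.

```python
import numpy as np


def center_normalize(matrices):
    """Z-score every box of one dataset (hypothetical helper).

    Assumption: one mean and one standard deviation are computed over
    all boxes of all matrices in the dataset.
    """
    # Step 2: pool every box to compute the dataset-wide statistics.
    values = np.concatenate([m.ravel() for m in matrices])
    mean = values.mean()
    std = values.std()
    if std == 0:
        raise ValueError("standard deviation is zero; cannot normalize")
    # Step 3: subtract the mean from each box, then divide by the std. dev.
    return [(m - mean) / std for m in matrices]
```

After this step the pooled boxes of a dataset have a mean of 0 and a standard deviation of 1, so some values are necessarily negative.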

The second step is simple: the minimum value over all datasets is added to each box of each matrix in each dataset.
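A minimal sketch of this second step is given below. It follows the wording above literally by adding the global minimum itself. Whether the script adds that value or its magnitude is not stated here, so treat the sign as an assumption. The function name add_global_minimum and the list-of-lists layout are hypothetical.

```python
import numpy as np


def add_global_minimum(datasets):
    """Add the minimum over all datasets to every box (hypothetical helper).

    `datasets` is assumed to be a list of datasets, each one a list of
    2-D NumPy arrays that have already been through the first step.
    """
    # The minimum is taken over every box of every matrix of every dataset.
    global_min = min(m.min() for matrices in datasets for m in matrices)
    # The same offset is applied everywhere, so datasets stay comparable.
    return [[m + global_min for m in matrices] for matrices in datasets]
```

Because one offset is shared by all datasets, differences between boxes are preserved both within and across datasets.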

created: August 2018
last modified: August 2018

4.1.7.1. Usage

This script normalizes multiple Hi-C datasets so that they can be compared with one another. It first subtracts the mean from each box, then divides each box by the standard deviation, and finally adds the minimum value over all datasets. For more details, see the header of the script file. The normalized datasets are named after the original ones, with the suffix _centernorm appended.

usage: norm_center [-h] [-t THREADS] [-v] [--debug] datasets [datasets ...]

4.1.7.1.1. Positional Arguments

datasets

the original datasets

4.1.7.1.2. Named Arguments

-t, --threads

the number of threads to use; high values use more memory

Default: 1

-v, --verbose

be verbose

Default: False

--debug

print debug information

Default: False
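The command-line interface documented above could be declared with argparse roughly as follows. This is a sketch under assumptions, not the script's own parser: the function name build_parser is hypothetical, and parsing THREADS as an integer is assumed rather than documented.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="norm_center",
        description="Normalize multiple Hi-C datasets so they can be compared.",
    )
    # One or more dataset names are required.
    parser.add_argument("datasets", nargs="+", help="the original datasets")
    # Assumption: the thread count is an integer.
    parser.add_argument(
        "-t", "--threads", type=int, default=1,
        help="the number of threads to use; high values use more memory",
    )
    parser.add_argument("-v", "--verbose", action="store_true", help="be verbose")
    parser.add_argument("--debug", action="store_true", help="print debug information")
    return parser
```

With this declaration, both flags default to False and the thread count defaults to 1, matching the defaults listed above.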