4.2.2. genes

This module contains functions that parse BED files containing genes and compute the adjacency of genes.

A gene is a dict with the following keys:

  • chrom, a str
  • start, an int
  • end, an int
  • name, a str
  • strand, a str, either + or -

In this file, the genes argument refers to a list of such dicts.
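As an illustration of this structure, here is a minimal sketch of one gene dict and a small checker. The key names and types come from the list above; the values and the helper is_valid_gene are hypothetical and not part of the module.

```python
# Hypothetical example values; only the key names and types are documented.
example_gene = {
    "chrom": "chr1",
    "start": 100,
    "end": 200,
    "name": "geneA",
    "strand": "+",
}


def is_valid_gene(gene):
    """Check that a dict carries the five documented keys with the documented types.

    This helper is an assumption made for illustration; the module is not
    documented as providing it.
    """
    expected = {"chrom": str, "start": int, "end": int, "name": str, "strand": str}
    # All documented keys must be present.
    if not expected.keys() <= gene.keys():
        return False
    # Each documented key must hold a value of the documented type.
    if not all(isinstance(gene[key], kind) for key, kind in expected.items()):
        return False
    # The strand can only be + or -.
    return gene["strand"] in ("+", "-")


# The genes argument used throughout this file is a list of such dicts.
genes = [example_gene]
```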

created: May 2018
last modified: September 2018

genes.compute_adjacent(genes)

Find the adjacent genes of every gene and return a mapping from each gene name to its left and right neighbours. The strand is not taken into account.

Warning

genes is expected to be sorted, as is the list returned by read_bed.
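The following sketch shows one way such a neighbour lookup could work on a sorted list. It is not the module's implementation: the function name sketch_adjacent, the shape of the returned mapping (a dict with "left" and "right" entries), the use of None for a missing neighbour, and the restriction of neighbours to the same chromosome are all assumptions.

```python
def sketch_adjacent(genes):
    """Map each gene name to the names of its left and right neighbours.

    Assumptions (not confirmed by the documentation): the list is sorted by
    chromosome and then by position, neighbours must lie on the same
    chromosome, and a missing neighbour is represented by None.
    """
    adjacent = {}
    for index, gene in enumerate(genes):
        # In a sorted list, the neighbours are the previous and next entries.
        left = genes[index - 1] if index > 0 else None
        right = genes[index + 1] if index + 1 < len(genes) else None
        # Discard neighbours located on a different chromosome.
        if left is not None and left["chrom"] != gene["chrom"]:
            left = None
        if right is not None and right["chrom"] != gene["chrom"]:
            right = None
        # The strand is deliberately ignored.
        adjacent[gene["name"]] = {
            "left": left["name"] if left is not None else None,
            "right": right["name"] if right is not None else None,
        }
    return adjacent


# Hypothetical input, already sorted.
sample_genes = [
    {"chrom": "chr1", "start": 100, "end": 200, "name": "geneA", "strand": "+"},
    {"chrom": "chr1", "start": 500, "end": 900, "name": "geneB", "strand": "-"},
    {"chrom": "chr2", "start": 50, "end": 80, "name": "geneC", "strand": "+"},
]
neighbours = sketch_adjacent(sample_genes)
```

Because only the previous and next entries are inspected, the result is meaningful only when the input is sorted, which is why the warning above matters.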

genes.make_bed(genes)

Convert the genes to BED. Return the resulting string.

Note

The score column is set to 0.
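A minimal sketch of such a conversion is shown below. It assumes the standard six-column, tab-separated BED layout (chrom, start, end, name, score, strand) with one gene per line; the name sketch_make_bed is hypothetical and the actual function may format its output differently.

```python
def sketch_make_bed(genes):
    """Return the genes as a tab-separated, six-column BED string.

    Assumption: one line per gene, terminated by a newline, with the
    columns chrom, start, end, name, score, strand.
    """
    lines = []
    for gene in genes:
        fields = [
            gene["chrom"],
            str(gene["start"]),
            str(gene["end"]),
            gene["name"],
            "0",  # the score column is always set to 0
            gene["strand"],
        ]
        lines.append("\t".join(fields))
    # Terminate every line, so an empty list yields an empty string.
    return "".join(line + "\n" for line in lines)


# Hypothetical gene used only for this example.
bed_text = sketch_make_bed(
    [{"chrom": "chr1", "start": 100, "end": 200, "name": "geneA", "strand": "+"}]
)
```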

genes.read_bed(name)

Read the BED file name and return a list of dicts. Each dict has the following keys: chrom, start, end, name, strand. The list is sorted.

Note

The BED file is loaded into memory for faster processing.
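The reading step can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the name sketch_read_bed is hypothetical, the input is assumed to be a six-column tab-separated BED file, and the sort key (chromosome, then start, then end) is a guess, since the documentation only says that the list is sorted.

```python
import os
import tempfile


def sketch_read_bed(path):
    """Read a six-column BED file and return a sorted list of gene dicts."""
    # Load the whole file into memory before parsing.
    with open(path) as handle:
        content = handle.read()
    genes = []
    for line in content.splitlines():
        # Skip blank lines and the usual BED header or comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        genes.append(
            {
                "chrom": fields[0],
                "start": int(fields[1]),
                "end": int(fields[2]),
                "name": fields[3],
                # fields[4] is the score column, which is not kept.
                "strand": fields[5],
            }
        )
    # Assumed sort order: chromosome, then start, then end.
    genes.sort(key=lambda gene: (gene["chrom"], gene["start"], gene["end"]))
    return genes


# Usage with a temporary, hypothetical BED file whose lines are out of order.
with tempfile.TemporaryDirectory() as directory:
    bed_path = os.path.join(directory, "example.bed")
    with open(bed_path, "w") as handle:
        handle.write("chr1\t500\t900\tgeneB\t0\t-\n")
        handle.write("chr1\t100\t200\tgeneA\t0\t+\n")
    loaded = sketch_read_bed(bed_path)
```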

genes.write_bed(genes, name)

Write the genes to the BED file name.