GenomicRanges package (Genomic ranges)

  1. Read the file Arabidopsis_thaliana_TSSs.rds, located in the Datasets folder, and assign it to a variable named tss.
    This file contains a GenomicRanges object representing the transcription start site (TSS) coordinates of Arabidopsis thaliana transcripts.
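
As a hedged illustration of how an .rds file is read back into R, the sketch below saves a toy GRanges object to a temporary file and restores it with readRDS(). The toy coordinates and the relative path in the final comment are assumptions, not part of the data set.

```r
library(GenomicRanges)

# Toy stand-in for the real file: a tiny GRanges saved to a temporary .rds
# (hypothetical coordinates, not taken from the Arabidopsis data).
toy <- GRanges(seqnames = c("Chr1", "Chr2"),
               ranges   = IRanges(start = c(100, 500), width = 1),
               strand   = c("+", "-"))
rds_path <- tempfile(fileext = ".rds")
saveRDS(toy, rds_path)

# readRDS() returns the stored object; assign it to a variable to keep it.
tss_toy <- readRDS(rds_path)
class(tss_toy)
length(tss_toy)

# For the exercise the call would presumably look like this
# (the relative path is an assumption about your working directory):
# tss <- readRDS("Datasets/Arabidopsis_thaliana_TSSs.rds")
```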

  1. Which chromosomes are present in Arabidopsis thaliana?
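
One possible way to inspect sequence names, shown on a toy object with hypothetical values: seqlevels() lists every sequence the object knows about, while seqnames() gives the sequence of each individual range.

```r
library(GenomicRanges)

# Hypothetical toy ranges on two sequences.
toy <- GRanges(seqnames = c("Chr1", "Chr1", "Chr3"),
               ranges   = IRanges(start = c(10, 50, 90), width = 1))

seqlevels(toy)         # all sequence names defined in the object
unique(seqnames(toy))  # sequence names used by at least one range
table(seqnames(toy))   # number of ranges per sequence
```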

  1. Subset tss, keeping only the ranges located on sequence (chromosome) 1, and reassign the result to the variable tss.
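
A minimal sketch of two ways to keep the ranges of a single sequence, on toy data. The name "Chr1" is an assumption; check the actual sequence names of your object first.

```r
library(GenomicRanges)
library(GenomeInfoDb)  # provides keepSeqlevels()

# Hypothetical toy ranges on two sequences.
toy <- GRanges(seqnames = c("Chr1", "Chr2", "Chr1"),
               ranges   = IRanges(start = c(100, 200, 300), width = 1))

# Logical subsetting keeps the matching ranges,
# but "Chr2" remains among the seqlevels.
chr1_only <- toy[seqnames(toy) == "Chr1"]
seqlevels(chr1_only)

# keepSeqlevels() with pruning.mode = "coarse" drops the ranges on all
# other sequences and removes their seqlevels as well.
chr1_pruned <- keepSeqlevels(toy, "Chr1", pruning.mode = "coarse")
seqlevels(chr1_pruned)
```

Which variant fits better depends on whether later steps should still see the other sequence names.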

  1. Extend the ranges so that each one is 1050 bases wide, extending only at the start position.
    Hint: Pay attention to which position should remain fixed. Reassign the result to the variable tss.
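
To illustrate how the fixed position interacts with strand, here is a sketch of resize() on two toy ranges (hypothetical coordinates). For GRanges objects, fix is interpreted relative to the strand unless ignore.strand = TRUE is set, so which variant matches the intended meaning of "start" is left for you to decide.

```r
library(GenomicRanges)

# One toy range per strand, each 1 base wide.
toy <- GRanges(seqnames = "Chr1",
               ranges   = IRanges(start = c(5000, 8000), width = 1),
               strand   = c("+", "-"))

# fix = "end" anchors the strand-aware end (3' side), so the range
# grows on its 5' side.
wide <- resize(toy, width = 1050, fix = "end")
start(wide)  # 3951 8000
end(wide)    # 5000 9049

# With ignore.strand = TRUE the genomic end coordinate is anchored
# for both strands.
wide_genomic <- resize(toy, width = 1050, fix = "end", ignore.strand = TRUE)
start(wide_genomic)  # 3951 6951
end(wide_genomic)    # 5000 8000
```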

  1. Shift the ranges by 50 bases and reassign the result to the variable tss.
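
A short sketch of shift() on toy ranges with hypothetical coordinates. It moves start and end by the same amount and does not take strand into account; a negative value moves the ranges toward lower coordinates.

```r
library(GenomicRanges)

toy <- GRanges(seqnames = "Chr1",
               ranges   = IRanges(start = c(100, 400), end = c(200, 500)),
               strand   = c("+", "-"))

# Both ranges move 50 bases toward higher coordinates, whatever their strand.
moved <- shift(toy, shift = 50)
start(moved)  # 150 450
end(moved)    # 250 550
width(moved)  # unchanged: 101 101
```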

  1. Merge overlapping ranges while considering strand information. Reassign the result to the variable tss.
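
The effect of strand on merging can be seen in the following sketch, which applies reduce() to toy ranges with hypothetical coordinates. By default reduce() treats each strand separately.

```r
library(GenomicRanges)

# Three overlapping ranges, one of them on the minus strand,
# plus one isolated range.
toy <- GRanges(seqnames = "Chr1",
               ranges   = IRanges(start = c(100, 150, 180, 400),
                                  end   = c(200, 250, 300, 500)),
               strand   = c("+", "+", "-", "+"))

# Strand-aware merge (default ignore.strand = FALSE):
# the minus-strand range is not merged with the plus-strand ones.
merged <- reduce(toy)
length(merged)  # 3

# Ignoring strand merges everything that overlaps; the result has strand "*".
merged_any <- reduce(toy, ignore.strand = TRUE)
length(merged_any)  # 2
```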

  1. Import the BED file Arabidopsis_casual_regions.bed, located in the Datasets folder.
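
As a sketch of BED import, the example below writes a two-line toy BED file to a temporary location and reads it with import() from the rtracklayer package. The toy intervals and the relative path in the final comment are assumptions.

```r
library(rtracklayer)

# Write a minimal 3-column BED file (tab-separated: chrom, start, end).
bed_path <- tempfile(fileext = ".bed")
writeLines(c("Chr1\t99\t200", "Chr1\t299\t400"), bed_path)

# import() picks the format from the file extension and returns
# the intervals as genomic ranges.
regions_toy <- import(bed_path)

# BED starts are 0-based; the imported ranges are 1-based.
start(regions_toy)  # 100 300
end(regions_toy)    # 200 400

# For the exercise the call would presumably look like this
# (the relative path is an assumption about your working directory):
# regions <- import("Datasets/Arabidopsis_casual_regions.bed")
```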

  1. Find overlaps between the extended and shifted regions obtained earlier and the intervals in Arabidopsis_casual_regions.bed. Then:
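
Leaving the follow-up steps aside, the overlap search itself can be sketched on toy ranges. All coordinates below are hypothetical; the subject ranges have strand "*", as ranges imported from a 3-column BED file would.

```r
library(GenomicRanges)

query <- GRanges(seqnames = "Chr1",
                 ranges   = IRanges(start = c(100, 400, 900),
                                    end   = c(200, 500, 950)),
                 strand   = c("+", "-", "+"))
subject <- GRanges(seqnames = "Chr1",
                   ranges   = IRanges(start = c(150, 450),
                                      end   = c(160, 600)))

# A Hits object pairing the indices of overlapping query and subject ranges.
hits <- findOverlaps(query, subject)
queryHits(hits)    # 1 2
subjectHits(hits)  # 1 2

# Number of subject ranges overlapped by each query range.
countOverlaps(query, subject)  # 1 1 0

# The query ranges that overlap at least one subject range.
subsetByOverlaps(query, subject)
```

A subject range with strand "*" is compatible with either strand, which is why the minus-strand query range is reported as a hit here.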

  1. Install the BSgenome data package for Arabidopsis thaliana (use BSgenome.Athaliana.TAIR.TAIR9).
    Create a GenomicRanges object with the following values:

Extract the base sequences of the genomic intervals contained in this GenomicRanges object and assign them to the variable sequences. What type of object do you obtain?
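
A sketch of sequence extraction with getSeq(), using a 10-base toy "genome" stored in a DNAStringSet instead of the real BSgenome package so that it runs without a large download. The toy sequence and coordinates are hypothetical, and the sketch assumes that the getSeq() method for XStringSet objects accepts a GRanges in the same way as the BSgenome method.

```r
library(GenomicRanges)
library(Biostrings)
library(BSgenome)

# Toy genome with a single sequence named "Chr1".
toy_genome <- DNAStringSet(c(Chr1 = "AAACCCGGGT"))

# Two toy intervals, one per strand.
gr <- GRanges(seqnames = "Chr1",
              ranges   = IRanges(start = c(1, 6), end = c(4, 9)),
              strand   = c("+", "-"))

# Minus-strand intervals are returned as the reverse complement
# of the stored sequence.
sequences_toy <- getSeq(toy_genome, gr)
class(sequences_toy)
as.character(sequences_toy)

# With the data package named in the exercise, the calls would presumably be:
# BiocManager::install("BSgenome.Athaliana.TAIR.TAIR9")
# library(BSgenome.Athaliana.TAIR.TAIR9)
# sequences <- getSeq(BSgenome.Athaliana.TAIR.TAIR9, gr)
```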


  1. Compute the reverse complement of the genomic sequences stored in the variable sequences.
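
For illustration, reverseComplement() from the Biostrings package applied to two toy sequences (hypothetical values, not derived from the genome):

```r
library(Biostrings)

seqs <- DNAStringSet(c("AAAC", "CCCG"))

# Complement every base, then reverse the order of each sequence.
rc <- reverseComplement(seqs)
as.character(rc)  # "GTTT" "CGGG"

# The two steps are also available separately.
as.character(reverse(complement(seqs)))  # same result
```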

  1. Create a GenomicRanges object using the following coordinates:

What do you obtain? Why?


  1. Create a GenomicRanges object using the following coordinates:

  1. Add strand information to the GenomicRanges object, choosing among the allowed symbols.

  1. Add names to the intervals: Gene_1, Gene_2, and Gene_3.
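
The last three steps can be sketched together on a toy object. The coordinates below are hypothetical placeholders and are not the ones intended by the exercise; the strand symbols accepted by GRanges are "+", "-" and "*".

```r
library(GenomicRanges)

# Hypothetical coordinates for three intervals.
gr <- GRanges(seqnames = c("Chr1", "Chr1", "Chr2"),
              ranges   = IRanges(start = c(100, 500, 900),
                                 end   = c(200, 650, 1000)))

# Without strand information every range gets "*" (strand unspecified).
as.character(strand(gr))  # "*" "*" "*"

# Assign one strand symbol per range.
strand(gr) <- c("+", "-", "*")

# Names make it possible to select ranges by label.
names(gr) <- c("Gene_1", "Gene_2", "Gene_3")
gr["Gene_2"]
```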