Input/Output or Conditional/Loops exercises).
Assign it to the variable cpgi, using stringsAsFactors = FALSE.
Filter cpgi, keeping only CpG islands located on canonical chromosomes (chr1–chr22, chrX, and chrY). Reassign the result to cpgi.
Convert cpgi into a GRanges object. Use the options keep.extra.columns = TRUE and ignore.strand = TRUE.
seqnames.field, start.field, end.field, and strand.field options.
Set the name column as the names of the object using names(), then remove the name column.
TxDb package related to the human UCSC hg18 assembly, then load it and assign it to the variable genome. Take some time to explore this object.
Extract the promoters from the genome object and assign them to the variable prom. Extend the TSS by 1000 bp upstream and 100 bp downstream (into the gene body).
prom.
seqlevels.
cpg_prom.
cpgi object?
prom?
Hint: When retrieving the query hits from the overlap, use the unique() function to avoid redundant ranges.
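As a rough illustration of the conversion and promoter steps in this exercise, the sketch below uses a small invented data frame in place of the real CpG island table. The column names (chrom, chromStart, chromEnd, name, cpgNum) and all coordinates are assumptions made for the example, and promoters() is applied to a toy GRanges of genes rather than to a TxDb object.

```r
library(GenomicRanges)

# Toy stand-in for the CpG island table (all values invented for illustration)
cpgi <- data.frame(
  chrom      = c("chr1", "chr2", "chr6_random", "chrX"),
  chromStart = c(100, 500, 50, 900),
  chromEnd   = c(300, 800, 150, 1200),
  name       = c("CpG: 20", "CpG: 35", "CpG: 12", "CpG: 41"),
  cpgNum     = c(20, 35, 12, 41),
  stringsAsFactors = FALSE
)

# Keep only islands on canonical chromosomes
canonical <- paste0("chr", c(1:22, "X", "Y"))
cpgi <- cpgi[cpgi$chrom %in% canonical, ]

# Convert to GRanges, naming the coordinate columns explicitly
cpgi <- makeGRangesFromDataFrame(
  cpgi,
  keep.extra.columns = TRUE,
  ignore.strand      = TRUE,
  seqnames.field     = "chrom",
  start.field        = "chromStart",
  end.field          = "chromEnd"
)

# Use the name column as range names, then drop it from the metadata
names(cpgi) <- cpgi$name
cpgi$name <- NULL

# Toy genes (hypothetical) to show how promoters() extends around the TSS
genes <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(5000, 5000), end = c(6000, 6000)),
  strand   = c("+", "-")
)
prom <- promoters(genes, upstream = 1000, downstream = 100)
```

Note that promoters() is strand-aware: for the minus-strand gene the TSS is taken at the end coordinate, so the upstream extension runs toward higher positions.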
Import the BED file containing methylation sites (H3K4me3) in untreated HeLa cells (H3K4me3_unstim_hg18_xset200_dupsN_ht5.sub.peaks_manipulated.bed in the Datasets folder) and assign it to the variable meth.
Find the overlaps between cpg_prom and meth, then:
Retrieve the subset of cpg_prom that overlaps with meth and assign it to the variable cpg_prom_Ov. Hint: Use unique() to retrieve only unique positions.
Retrieve the subset of meth that overlaps with cpg_prom and assign it to the variable meth_Ov. Hint: Use unique() to retrieve only unique positions.
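The role of unique() in the two hints can be seen on a small example. The sketch below is only an illustration: the two GRanges objects are built by hand with invented coordinates and merely reuse the variable names from the exercise.

```r
library(GenomicRanges)

# Toy ranges (invented coordinates) standing in for the real objects
cpg_prom <- GRanges("chr1", IRanges(start = c(100, 1000, 5000),
                                    end   = c(400, 1500, 5600)))
meth <- GRanges("chr1", IRanges(start = c(150, 300, 5500, 9000),
                                end   = c(200, 350, 5700, 9100)))

# One row per overlapping (query, subject) pair
ov <- findOverlaps(cpg_prom, meth)

# The first cpg_prom range overlaps two meth ranges, so its index appears
# twice among the query hits; unique() keeps each range only once
cpg_prom_Ov <- cpg_prom[unique(queryHits(ov))]
meth_Ov     <- meth[unique(subjectHits(ov))]
```

Without unique(), indexing with queryHits(ov) would return the first cpg_prom range twice, once for each meth range it overlaps.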
GTF file for mouse version M24 located in the Datasets folder. Use the makeTxDbFromGFF() function. (Note: This operation may take some time.)
mouse.
columns() function: it contains a lot of useful information.
Extract the transcripts from mouse and assign them to the variable transc. Use the parameter columns = c("tx_name", "gene_id") to include additional information.
Create all_transcripts, containing all unique transcript names from transc.
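For the last step, a minimal sketch on toy data may help. It assumes transc is a GRanges carrying tx_name and gene_id metadata columns; the object below is built by hand with made-up identifiers rather than extracted from the mouse TxDb.

```r
library(GenomicRanges)

# Hand-built stand-in for transc (identifiers and coordinates are made up)
transc <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(100, 100, 900), end = c(500, 500, 1300)),
  strand   = "+",
  tx_name  = c("tx_A", "tx_A", "tx_B"),
  gene_id  = c("gene_1", "gene_1", "gene_2")
)

# Metadata columns are reachable with $; unique() drops repeated names
all_transcripts <- unique(transc$tx_name)
```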