Home | Blog | Software | Publications | GitHub


Circular visualization of DMRs from tagmentation-based WGBS

Tagmentation-based whole-genome bisulfite sequencing (T-WGBS) is a technology which can examine only a minor fraction of methylome of interest. Circular plot can be used to visualize genome-wide distribution of differentially methylation regions (DMRs). In this post, we demonstrate how to visualize DMRs which are detected from T-WGBS data in a circular plot by circlize package.

In tagments_WGBS_DMR.RData, tagments contains regions which are sequenced, DMR1 and DMR2 contain DMRs for two patients detectd in tagment regions. Correspondance between tagment regions and DMRs can be checked by row names of tagments and tagment column in DMR1 or DMR2.

load("../data/tagments_WGBS_DMR.RData")
tagments[1:2, ]
##                         chr    start      end
## chr1-44876009-45016546 chr1 44876009 45016546
## chr1-90460304-90761641 chr1 90460304 90761641
DMR1[1:2, ]
##    chr    start      end methDiff                tagment
## 1 chr1 44894352 44894643    -0.28 chr1-44876009-45016546
## 2 chr1 44902069 44902966    -0.33 chr1-44876009-45016546
DMR2[1:2, ]
##    chr    start      end methDiff                tagment
## 1 chr2 64979958 64980308    -0.22 chr2-64897317-65173964
## 2 chr2 65038093 65039051    -0.21 chr2-64897317-65173964

Chromosomes (e.g. chr1, chr2) and tagments (e.g. chr1-44876009-45016546, chr1-90460304-90761641) are actually different types of categories and circlize can only deal with one type at a time. In order to merge chromosomes and tagment regions into one same plot, the strategy is to create two independent circular plot but overlay the second one directly to the first one by specifying par(new = TRUE).

First we draw the ideograms as well as the chromosome names and we call it 'the first circular plot'.

library(circlize)

circos.par(gap.degree = 2, start.degree = 90)
circos.initializeWithIdeogram(chromosome.index = paste0("chr", 1:22), plotType = c("ideogram", "labels"),
    ideogram.height = 0.03)

plot of chunk unnamed-chunk-2

To make the correspondance between two circular plots, we need mapping of a same genomic position between two circular plots. Here the key solution is the polar coordinate system because all circular plot created by circlize are in a same polar coordinate system. With using circlize() function, we can transform genomic positions in one circular plot into polar coordinates and we can apply reverse.circlize() to get the new genomic positions in the second circualr plot.

# calculate position of each tagment measured in the polar coordinate system
for(i in seq_len(nrow(tagments))) {
    tagments[i, "theta1"] = circlize(tagments[i, 2], 1, sector.index = tagments[i, 1])[1, 1]
    tagments[i, "theta2"] = circlize(tagments[i, 3], 1, sector.index = tagments[i, 1])[1, 1]
}

Since the second circular plot which visualizes DMRs will be put inside the ideograms, we need to know the radius which inside the ideogram. Since the ideogram in the first circular plot is the last track, we can use circlize:::get_most_inside_radius() to get the position.

r = circlize:::get_most_inside_radius()
circos.clear()

Now we make the second circular plot which visualizes DMRs. Here par(new = TRUE) is set to directly overlay to the first plot.

in circos.par(), start.degree can be adjusted to rotate the second circular plot to make better correspondance between two circular plots.

chr_bg_color is defined here to enhance discrimination of different chromosomes.

par(new = TRUE)

set.seed(123)
chr_bg_color = rand_color(22, transparency = 0.8)
names(chr_bg_color) = paste0("chr", 1:22)

circos.par(cell.padding = c(0.02, 0, 0.02, 0), gap.degree = c(rep(1, nrow(tagments)-1), 10), 
    start.degree = 75, points.overflow.warning = FALSE)
circos.initialize(factors = factor(rownames(tagments), levels = rownames(tagments)), 
    xlim = as.matrix(tagments[, 2:3]))

In the first track of the second circular plot, we need to draw kind of connections between two circular plots. Since we know the coordinates of tagment regions in the polar coordinate system, here we use reverse.circlize() to transform back to the corresponding genomic positions (or coordinate in the data coordinate system).

Since the second plot is drawn inside the ideograms, r is set as the outter margin of the second plot.

circos.track(ylim = c(0, 1), panel.fun = function(x, y) {

    si = get.cell.meta.data("sector.index")
    chr = gsub("^(chr.*?)-.*$", "\\1", si)
    theta1 = tagments[si, "theta1"]
    theta2 = tagments[si, "theta2"]

    xlim = get.cell.meta.data("cell.xlim")
    ylim = get.cell.meta.data("cell.ylim")

    cell.top.radius = get.cell.meta.data("cell.top.radius")

    # map from polar coordinate system to data coordinate system
    df = reverse.circlize(c(theta1, theta2), c(cell.top.radius, cell.top.radius))

    x21 = df[1, 1]
    x22 = df[2, 1]
    y21 = df[1, 2]
    y22 = df[2, 2]
    x11 = xlim[1]
    x12 = xlim[2]
    y11 = ylim[1]
    y12 = ylim[1]
    circos.polygon(c(x11, x11, x21, x21, x22, x22, x12, x12, x11),
                   c(y11, (y21 - y11)/3, (y21 - y11)/3*2, y21, y22, (y22 - y12)/3*2, (y22 - y12)/3, y12, y11), 
                   col = chr_bg_color[chr])

}, track.margin = c(0, 1 - r), cell.padding = c(0, 0, 0, 0), bg.border = NA, track.height = 0.1)

plot of chunk unnamed-chunk-4

Now we add DMR tracks. In each cell (plotting region), we added the background with colors, reference lines and points.

max_abs = max(abs(c(DMR1$methDiff, DMR2$methDiff)))
max_abs = ceiling(max_abs*10)/10
circos.track(ylim = c(-max_abs, max_abs), panel.fun = function(x, y) {
    si = get.cell.meta.data("sector.index")
    chr = gsub("^(chr\\d+).*$", "\\1", si)
    xlim = get.cell.meta.data("cell.xlim")
    ylim = get.cell.meta.data("cell.ylim")

    circos.rect(xlim[1], ylim[1], xlim[2], ylim[2], col = chr_bg_color[[chr]])
    for(h in seq(-max_abs, max_abs, by = 0.3)) {
        circos.lines(xlim, c(h, h), lty = 3, col = "#AAAAAA")
    }

    circos.lines(xlim, c(0, 0), lty = 3, col = "#888888")

    subset = DMR1[DMR1$tagment == si, , drop = FALSE]
    if(nrow(subset) > 0) {
        circos.points((subset[[2]] + subset[[3]])/2, subset$methDiff, 
            col = ifelse(subset$methDiff > 0, "#E41A1C", "#377EB8"), pch = 16, cex = 0.5)
    }

}, bg.border = 1, track.height = 0.15)

Also we add y-axis and labels on the left of the first tagment.

first_sector = get.all.sector.index()[1]
circos.yaxis(side = "left", at = seq(-0.6, 0.6, by = 0.3), sector.index = first_sector,
    labels.cex = 0.4)
xlim = get.cell.meta.data("cell.xlim", sector.index = first_sector)
ylim = get.cell.meta.data("cell.ylim", sector.index = first_sector)
circos.text(xlim[1], mean(ylim), "d1", facing = "clockwise", niceFacing = TRUE, cex = 0.8, 
    adj = c(0.5, degree(6)), sector.index = first_sector)

plot of chunk unnamed-chunk-5

Add the DMRs for the second patient.

circos.track(ylim = c(-max_abs, max_abs), panel.fun = function(x, y) {
    si = get.cell.meta.data("sector.index")
    chr = gsub("^(chr\\d+).*$", "\\1", si)
    xlim = get.cell.meta.data("cell.xlim")
    ylim = get.cell.meta.data("cell.ylim")
    circos.rect(xlim[1], ylim[1], xlim[2], ylim[2], col = chr_bg_color[[chr]])
    for(h in seq(-max_abs, max_abs, by = 0.3)) {
        circos.lines(xlim, c(h, h), lty = 3, col = "#AAAAAA")
    }
    circos.lines(xlim, c(0, 0), lty = 3, col = "#888888")
    subset = DMR2[DMR2$tagment == si, , drop = FALSE]
    if(nrow(subset) > 0) {
        circos.points((subset[[2]] + subset[[3]])/2, subset$methDiff, 
            col = ifelse(subset$methDiff > 0, "#E41A1C", "#377EB8"), pch = 16, cex = 0.5)
    }
}, bg.border = 1, track.height = 0.15)
circos.yaxis(side = "left", at = seq(-0.6, 0.6, by = 0.3), sector.index = first_sector,
    labels.cex = 0.4)
xlim = get.cell.meta.data("cell.xlim", sector.index = first_sector)
ylim = get.cell.meta.data("cell.ylim", sector.index = first_sector)
circos.text(xlim[1], mean(ylim), "d2", facing = "clockwise", niceFacing = TRUE, cex = 0.8, 
    adj = c(0.5, degree(6)), sector.index = first_sector)

For the most inside track, we explicitely show the color of chromosomes to make it easier to correspond between target regions to chromosomes.

circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
    cate = get.cell.meta.data("sector.index")
    chr = gsub("^(chr\\d+).*$", "\\1", cate)
    xlim = get.cell.meta.data("cell.xlim")
    ylim = get.cell.meta.data("cell.ylim")
    circos.rect(xlim[1], ylim[1], xlim[2], ylim[2], col = gsub("\\d\\d$", "", chr_bg_color[[chr]]))
}, track.height = 0.02, cell.padding = c(0, 0, 0, 0))

circos.clear()

legend("center", pch = 16, legend = c("Hyper-DMR", "Hypo-DMR"), col = c("#E41A1C", "#377EB8"))

par(new = FALSE)

plot of chunk unnamed-chunk-6