Access gene annotation using gffutils

Recently, I had to access gene annotations in multiple versions from multiple sources such as Ensembl, GENCODE, and UCSC. I used to rely on …

Read UniProtKB in XML format

UniProt Knowledge Base (UniProtKB) provides various methods to access their data. I settled on their XML format since no additional parsing code is required …

Ad hoc bioinformatic analysis in database

Recently I’ve found that bioinformatic analysis in a database is not hard at all, and the database setup wasn’t as daunting …

Using EnsDb's annotation database in Python

How to find and download EnsDb, the Ensembl genomic annotation packaged as an SQLite database by the R package ensembldb, and use it in a Python application.

Use Snakemake on Google Cloud

TL;DR Run an RNA-seq pipeline using Snakemake locally and later port it to Google Cloud. Snakemake can parallelize jobs of a pipeline and …

Variants, eQTL, MPRA

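This post is drawn mainly from my notes on several lectures given by Professor Barak Cohen. It looks at coding/noncoding variant modeling and the related MPRA experiments from a Systems Biology perspective.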

Ensembl Genomic Reference in Bioconductor

Using fundamental R/Bioconductor packages (e.g. AnnotationHub, ensembldb and biomaRt) to query Ensembl genomic references or annotations.

Plot Sequencing Depth with Gviz

TL;DR Plot exome sequencing depth and coverage with genome annotation using Gviz in R. Then apply detailed control over Gviz annotation track displaying …

Overview of Genomic Data Processing in Bioconductor

Notes on fundamental tools and learning resources for handling genomic data in R with Bioconductor.

FASTA/Q sequence processing toolkit -- seqtk

This post demonstrates FASTQ-to-FASTA conversion and sequence quality checking using seqtk.