Research Fellow/Professor  |  Lin, Chung-Yen  
Software Name:DocMethyl, EpiMOLAS Galaxy Docker for DNA methylation analysis in WGBS
Inventors:Sheng-Yao Su, Shu-Hwa Chen, I-Hsuan Lu, Yih-Shien Chiang, Yu-Bin Wang, Chung-Yen Lin

We tested the Galaxy Docker container with the provided demo dataset on a 64-bit Ubuntu 16.04 system equipped with a four-core CPU, 16 GB of RAM, and 400 GB of data storage. The estimated elapsed time for the demo toy dataset in single-thread mapping mode is 10 hours. The workflow generates around 50 GB of intermediate files. To save time, we suggest that users run DocMethyl on Linux servers or on private or public clouds.

It is the first Galaxy Docker that simplifies WGBS data analysis to a few clicks in a web interface, for both model and non-model organisms.


Software Name:Electronic Lab Notebook on DOCKER
Inventors:Chi-Wei Huang, Shu-Hwa Chen, Chung-Yen Lin

Electronic Laboratory Notebook: ELN Docker version 2.1.2. Please find it on

Software Name:myBLAST on DOCKER
Inventors:Linda Lu, Shu-Hwa Chen, Chung-Yen Lin

myBLAST: A standalone version of NCBI BLAST with database and result management

To learn more about myBLAST and to find myBLAST on different platforms, please visit

Software Name:DocExpress: A galaxy docker for estimation of Expression profiling in RNA-seq
Inventors:Ping-Heng Hsieh, Shu-Hwa Chen, I-Hsuan Lu, Chung-Yen Lin

It is a Galaxy Docker that simplifies RNA-seq data analysis to a few clicks in a web interface, for both model and non-model organisms.


Software Name:Mysort: A Gene Profiling Deconvolution Approach to Estimating Immune Cell Composition from Complex Tissues
Inventors:Wen-Yu Kuo, Shu-Hwa Chen, Hrong-Hsin Lu, Chung-Yen Lin

An effective cure for cancer is always caught between tumor heterogeneity and the mechanisms by which tumors avoid immune destruction. To develop advanced treatments such as immunotherapy, and to learn more about the progression of cancer, studying the composition of TILs (tumor-infiltrating lymphocytes) is the key. In vitro methods that address this problem, including immunohistochemistry and flow cytometry, are prone to bias owing to the series of experimental steps they require. Gene expression deconvolution is an alternative in silico method that aims to estimate the relative proportions of the immune cells of interest.

Microarray gene expression profiles of 22 immune cell types are used to construct a signature matrix, which plays an important role in deconvolution. Genes specific to non-hematopoietic and cancer cells are filtered out first, before the signature genes are selected. Next, a t-test is applied to each gene between immune cell types, and the condition number is used to determine how many genes are selected. Finally, ν-Support Vector Regression is used to construct the deconvolution model and to produce the relative proportions of immune cells in the mixtures.
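The final step, turning a signature matrix and a mixture profile into cell-type proportions, can be illustrated with a small sketch. This is not the mySort implementation: to stay free of dependencies it swaps the ν-Support Vector Regression for plain non-negative least squares solved by projected gradient descent. The function name `deconvolve`, the toy signature matrix, and the mixture are all made up for illustration.

```python
def deconvolve(signature, mixture, steps=2000):
    """Estimate cell-type proportions for one mixture sample.

    signature: list of rows, one per gene, each with one value per cell type.
    mixture:   list with one expression value per gene.
    Stand-in method (assumption): non-negative least squares by projected
    gradient descent, NOT the nu-SVR named in the text.
    """
    n_types = len(signature[0])
    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so 1 / norm is a safe step size for gradient descent.
    step = 1.0 / sum(v * v for row in signature for v in row)
    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        residual = [
            sum(row[j] * weights[j] for j in range(n_types)) - mixture[i]
            for i, row in enumerate(signature)
        ]
        gradient = [
            sum(signature[i][j] * residual[i] for i in range(len(signature)))
            for j in range(n_types)
        ]
        # Gradient step followed by projection onto non-negative weights.
        weights = [max(0.0, weights[j] - step * gradient[j]) for j in range(n_types)]
    total = sum(weights)
    # Normalize so the weights read as relative proportions.
    return [w / total for w in weights] if total > 0 else weights


# Toy data: 4 signature genes x 3 cell types (hypothetical values).
toy_signature = [[10, 1, 0], [1, 8, 1], [0, 2, 9], [3, 3, 3]]
true_proportions = [0.5, 0.3, 0.2]
toy_mixture = [
    sum(row[j] * true_proportions[j] for j in range(3)) for row in toy_signature
]
estimated = deconvolve(toy_signature, toy_mixture)
print([round(p, 3) for p in estimated])
```

On this noise-free toy mixture the estimate recovers the proportions used to build it; real expression data would also need the gene filtering and signature selection steps described above.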

PBMC (peripheral blood mononuclear cell) mixtures from 20 adults were deconvoluted to obtain the relative proportions of the immune cells of interest. These samples were also profiled by flow cytometry. The relative proportions of 9 of the 22 immune cell types were used to examine the performance of our deconvolution method. The results were favorable, and better than those of other deconvolution methods.

The deconvolution program for estimating the proportion of immune cells inside a tumor was implemented and wrapped as a Galaxy plug-in on DOCKER. The DOCKER file will be released soon.

Meanwhile, the mySort website is also open to the public.

Software Name:Cytohubba for cytoscape 3.x
Inventors:Chia-Hao Chin, Po-Wen Chen, Chin-Wen Ho, Shu-Hwa Chen, Ming-Tat Ko, Chung-Yen Lin

Explore important nodes/hubs and fragile motifs in an interactome network by several topological algorithms including Degree, Edge Percolated Component (EPC), Maximum Neighborhood Component (MNC), Density of Maximum Neighborhood Component (DMNC), Maximal Clique Centrality (MCC) and centralities based on shortest paths, such as Bottleneck (BN), EcCentricity, Closeness, Radiality, Betweenness, and Stress.
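As an illustration of two of the local measures listed above, the sketch below computes Degree and Maximum Neighborhood Component (MNC) for a small undirected network. It assumes that MNC is the size of the largest connected component in the subgraph induced by a node's neighbors, which is how the measure is usually described; the helper names and the toy network are hypothetical and this is not the plugin's own code.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree(adj, node):
    """Number of direct neighbors of the node."""
    return len(adj[node])


def mnc(adj, node):
    """Size of the largest connected component among the node's neighbors.

    Assumed definition: only edges between neighbors count, and the node
    itself is excluded from the induced subgraph.
    """
    neighbors = adj[node]
    seen = set()
    best = 0
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first search restricted to the neighborhood.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            current = stack.pop()
            size += 1
            for nxt in adj[current] & neighbors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


# Toy network (hypothetical): A is a hub whose neighbors B and C interact.
toy_adj = build_adjacency([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")])
print(degree(toy_adj, "A"), mnc(toy_adj, "A"))
```

Here node A has three neighbors, but only B and C are connected to each other, so its neighborhood component is smaller than its degree.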

From January 2017 to July 2017, there were more than a thousand downloads from around the world. The download statistics can be found in

Software Name:Electronic Lab notebook 7.x
Inventors:Tima Huang, Shu-Hwa Chen, Chung-Yen Lin

A new version of the Electronic Lab notebook for Windows/Mac/Linux. A DOCKER version of ELN is under development.


Through the ages, scientific records have been kept in paper-bound laboratory notebooks. Even at the start of the 21st century, most scientists still document experiments in paper-bound laboratory notebooks. However, the increasing volume and complexity of laboratory experiments, together with the emphasis on cooperation between disciplines, have led to the invention of the electronic laboratory notebook (ELN). ELN enables its users to quickly locate laboratory records, sort and compare records, and share information.

One no longer wastes time looking through piles of paper-bound laboratory notebooks when the right information is just one click away in ELN. ELN increases the efficiency of work and consequently saves money and human resources. Furthermore, well-organized electronic records can be easily compared, sorted, and shared. This increases the possibility of new discoveries and problem solving, which may significantly increase a company's competitiveness. The paper-bound laboratory notebook, on the other hand, is harder to share and takes longer to search and organize, and consequently hinders invention and problem solving. For example, one may need to share experimental results with a team member in another part of the globe. Faxing thousands of paper-bound records to a teammate would be inefficient, but with ELN one can share thousands of records in just a few seconds. In addition, ELN's search function lets one search by category, such as articles, images, and other related items, so that one can quickly reach the right information. Lastly, ELN increases the availability of one's research results, so that one is recognized for one's actual laboratory work and not merely for one's publications.

In conclusion, ELN strengthens the information technology infrastructure, which in turn supports invention, the sharing and reuse of recorded knowledge, and recognition based on one's actual laboratory work.


Software Name:epi-MOLAS: Epi-genoMics OnLine Analysis System
Inventors:Sheng-Yao Su, Shu-Hwa Chen, I-Hsuan Lu, Yih-Shien Chiang, Yu-Bin Wang, Pao-Yang Chen, Chung-Yen Lin

DNA methylation is an important regulator of genome function. It affects the superstructure of the DNA duplex and its affinity for certain DNA-binding proteins, resulting in a variety of biological effects, such as genomic imprinting, X-chromosome inactivation, ageing, and changes in gene transcription. It can be a dynamic process that alters gene activity temporarily, a permanent change upon cell fate determination or commitment, or a player in epigenetic inheritance. By combining bisulfite conversion of genomic DNA with next-generation sequencing (BS-Seq), we can scan the 5-methylcytosine level of all available C residues at whole-genome scale. To make BS-Seq data more accessible to biologists, we built the epi-MOLAS workbench. The outputs of two popular bisulfite sequence mapping programs, BS Seeker (version 2, CGmap) and Bismark (bismark_methylation_extractor), are supported as inputs to epi-MOLAS. First, methylation information for the three 5-methylcytosine sequence contexts (CpG, CHG, CHH) is calculated for the "promoter" and "gene-body" regions. The counts are then joined to a gene annotation database to create a data-analysis website. Besides access to the methylation indexes of individual genes, advanced analyses such as functional enrichment on GO and KEGG, or views of the methylation level as a heatmap or whole-genome plot, can be performed easily and seamlessly with the built-in quantitative analysis pipelines ("differentially methylated genes" and "methylation profiling"). The constructed website can be opened to public access or shared in a keyword-controlled way.
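For readers unfamiliar with the three sequence contexts, the following sketch shows how a forward-strand cytosine is assigned to CpG, CHG, or CHH, where H stands for A, C, or T. It is only an illustration of the standard definitions, not code from epi-MOLAS; the function names and the example sequence are hypothetical, and cytosines on the reverse strand would need the reverse complement of the sequence.

```python
def cytosine_context(seq, i):
    """Classify the forward-strand cytosine at 0-based index i.

    Returns "CpG", "CHG", "CHH", or None when the base is not a C or the
    context cannot be resolved (sequence end or ambiguous base).
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    first = seq[i + 1 : i + 2]   # base right after the C ("" at sequence end)
    second = seq[i + 2 : i + 3]  # base two positions after the C
    if first == "G":
        return "CpG"
    if first not in ("A", "C", "T") or second == "":
        return None
    if second == "G":
        return "CHG"
    if second in ("A", "C", "T"):
        return "CHH"
    return None


def forward_strand_contexts(seq):
    """List (position, context) for every classifiable forward-strand C."""
    found = []
    for i in range(len(seq)):
        context = cytosine_context(seq, i)
        if context is not None:
            found.append((i, context))
    return found


# Hypothetical example sequence.
example_contexts = forward_strand_contexts("ACGTCAGCTTC")
print(example_contexts)
```

The final C in the example is skipped because no downstream bases are available to decide its context.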

Website URL:

The pipeline from raw sequence data to methylation-level estimates was also implemented as a workflow in Galaxy on DOCKER. The whole pipeline will be released soon as a docker file on Docker Hub.


Software Name:TEA: The Epigenome platform for Arabidopsis methylome study
Inventors:Sheng-Yao Su, Shu-Hwa Chen, I-Hsuan Lu, Yih-Shien Chiang, Yu-Bin Wang, Pao-Yang Chen, Chung-Yen Lin

Background: Bisulfite sequencing (BS-seq) has become a standard technology for profiling DNA methylation states at single-base resolution. It allows researchers to conduct genome-wide methylation studies of genomic imprinting, transcriptional regulation, cellular development, and differentiation. A single dataset from a BS-Seq experiment is resolved into many features according to sequence context, making methylome data analysis and visualization a complicated task.

Results: We developed a streamlined platform, TEA, for analyzing data from whole-genome BS-Seq (WGBS) experiments conducted in the model plant Arabidopsis thaliana. To capture the essence of genome methylation status while remaining efficient enough to run online, we developed a straightforward method for measuring methylation status with respect to sequence context and coded it into a script that processes BS-Seq mapping results. Through a simple data upload process, the TEA server deploys a web-based analysis platform by linking the methylation level of each gene to an updated Arabidopsis annotation database and to toolkits for deeper analysis.
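One simple way to obtain a per-gene, per-context methylation level from mapping results is sketched below. The exact measure used by TEA is not spelled out here, so this example assumes a read-weighted level (methylated reads divided by total reads over all cytosines of one context in one gene). The record layout, the function name, and the counts are hypothetical.

```python
from collections import defaultdict


def methylation_levels(records, min_coverage=1):
    """Aggregate cytosine-level counts into gene-level methylation levels.

    records: iterable of (gene, context, methylated_reads, total_reads),
             one entry per cytosine (hypothetical layout).
    Returns {(gene, context): methylated / total}, using only cytosines
    covered by at least min_coverage reads.
    """
    methylated = defaultdict(int)
    total = defaultdict(int)
    for gene, context, m_reads, all_reads in records:
        if all_reads < min_coverage:
            continue  # skip cytosines with too little coverage
        methylated[(gene, context)] += m_reads
        total[(gene, context)] += all_reads
    return {key: methylated[key] / total[key] for key in total if total[key] > 0}


# Made-up counts for two genes.
toy_records = [
    ("geneA", "CpG", 8, 10),
    ("geneA", "CpG", 2, 10),
    ("geneA", "CHH", 0, 20),
    ("geneB", "CpG", 0, 0),  # uncovered cytosine, ignored
]
toy_levels = methylation_levels(toy_records)
print(toy_levels)
```

Weighting by read count keeps poorly covered cytosines from dominating the gene-level value; averaging per-site ratios would be an equally plausible alternative.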

Conclusions: TEA is an intuitive and efficient online platform for analyzing the Arabidopsis genomic DNA methylation landscape. It provides several ways to help users explore and make discoveries from their uploaded high-throughput data. It can also facilitate data sharing among collaborators.

TEA is freely accessible for academic users at:

Software Name:EVIDENCE: Genotyping and Recombination Detection of Enterovirus
Inventors:Chieh Hua Lin, Yu Bin Wang, Shu-Hwa Chen, Chao Agnes Hsiung, Chung Yen Lin

Enteroviruses (EV) with different genotypes cause diverse infectious diseases in humans and mammals. A correct EV typing result is crucial for effective medical treatment and disease control; however, the emergence of novel viral strains has impaired the performance of available diagnostic tools. Here, we present a web-based tool, named EVIDENCE (EnteroVirus In DEep conception), for EV genotyping and recombination detection. We introduce the idea of using mixed-ranking scores to evaluate the fitness of prototypes based on relatedness and on the genome regions of interest. Using phylogenetic methods, the most probable genotype is determined based on the closest neighbor among the selected references. To detect possible recombination events, EVIDENCE calculates the sequence distance and phylogenetic relationship among sequences of all sliding windows scanning over the whole genome. Detected recombination events are plotted in an interactive figure for viewing of fine details. In addition, all EV sequences available in GenBank were collected and revised using the latest classification and nomenclature of EV in EVIDENCE. These sequences are built into the database and can be retrieved from an indexed catalog, or searched by keyword or by sequence similarity. EVIDENCE is the first web-based tool containing pipelines for genotyping and recombination detection, with updated, built-in, and complete reference sequences to improve sensitivity and specificity. The use of EVIDENCE can accelerate genotype identification, aiding clinical diagnosis and enhancing our understanding of EV evolution.
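The sliding-window idea behind recombination detection can be sketched as follows. This is a simplified illustration, not the EVIDENCE pipeline: it assumes the sequences are already aligned to equal length, uses the plain p-distance (fraction of differing sites) as the distance measure, and omits the phylogenetic step entirely. The function names, reference names, and sequences are hypothetical.

```python
def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences.

    Columns with a gap ("-") in either sequence are ignored; returns None
    when no comparable column remains.
    """
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return None
    return sum(x != y for x, y in compared) / len(compared)


def closest_reference_per_window(query, references, window, step):
    """For each window, report the reference closest to the query.

    references: {name: aligned sequence}. Ties are broken by name.
    Returns a list of (window_start, reference_name).
    """
    result = []
    for start in range(0, len(query) - window + 1, step):
        scored = []
        for name, ref in references.items():
            d = p_distance(query[start : start + window], ref[start : start + window])
            if d is not None:
                scored.append((d, name))
        if scored:
            result.append((start, min(scored)[1]))
    return result


# Toy alignment: the query matches typeA in its first half and typeB in its second.
toy_refs = {"typeA": "A" * 16, "typeB": "C" * 16}
toy_query = "A" * 8 + "C" * 8
toy_windows = closest_reference_per_window(toy_query, toy_refs, window=8, step=8)
print(toy_windows)
```

A change in the closest reference along the genome, as in this toy query, is the kind of signal that would then be examined as a candidate recombination event.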


Software Name:MOLAS: Multi-Omics onLine Analysis System
Inventors:Shu-Hwa Chen, Linda Lu, Daniel Su and Chung-Yen Lin

Next-generation sequencing technologies have brought gene profiling studies into the era of big-data science. However, the growing amount of data has itself become a problem for viewing the data and analyzing its biological implications.

Here we present MOLAS, the Multi-Omics onLine Analysis System, a robust web service that holds gene expression data with built-in annotations and a data analysis toolkit. MOLAS is composed of two parallel server daemons: MOLAS/pilot for project management, and MOLAS/harbor for hosting the data retrieval and analysis website. A project is created on MOLAS/pilot through an intuitive data loading process. The uploaded data is then forwarded to MOLAS/harbor, connected to the built-in annotations and data analysis pipeline, and turned into a website. MOLAS/harbor provides data access functions including full-text search, KEGG pathway and module hierarchy views, pairwise library comparison, clustering by a user-defined scheme, and enrichment analysis in KEGG pathways and GO terms of the differentially expressed genes identified by pairwise comparison or by clustering analysis.
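Enrichment analysis of a gene list against a pathway or GO term is commonly scored with a one-sided hypergeometric test. The statistic MOLAS actually uses is not stated here, so the sketch below is only an assumed, typical formulation; the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value for over-representation.

    population: number of genes in the background
    annotated:  background genes that belong to the pathway or term
    selected:   number of differentially expressed genes
    overlap:    selected genes that belong to the pathway or term
    Returns P(X >= overlap) when drawing `selected` genes at random.
    """
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population, selected)


# Made-up numbers: 20000 background genes, 150 in the pathway,
# 300 differentially expressed genes, 12 of them in the pathway.
example_p = enrichment_p_value(20000, 150, 300, 12)
print(f"{example_p:.3e}")
```

In practice the p-values for all tested pathways or terms would also be corrected for multiple testing.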

Currently, MOLAS accepts gene expression data tables in FPKM values, derived from Cufflinks or other tools that map reads to the human reference (hg19) or mouse reference (mm10), and indexed by gene symbol. The website derived from a project can be a long-term, openly accessible website, or a private, password-controlled site that remains accessible for six months.

MOLAS/pilot Site:

Software Name:cytoHubba: Identify Hub Objects and sub-network from Complex Interactome
Inventors:Chia-Hao Chin, Shu-Hwa Chen, Hsin-Hung Wu, Chin-Wen Ho, Ming-Tat Ko, and Chung-Yen Lin


A network is a useful way to present many types of biological data, including protein-protein interactions, gene regulation, cellular pathways, and signal transduction. By measuring nodes by their network features, we can infer their importance in the network and identify the central elements of biological networks.


We introduce a novel Cytoscape plugin, cytoHubba, for ranking nodes in a network by their network features. CytoHubba provides 11 topological analysis methods: Degree, Edge Percolated Component, Maximum Neighborhood Component, Density of Maximum Neighborhood Component, Maximal Clique Centrality, and six centralities based on shortest paths (Bottleneck, EcCentricity, Closeness, Radiality, Betweenness, and Stress). Among the eleven methods, the newly proposed method, MCC, performs better in the precision of predicting essential proteins from the yeast PPI network.
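To give a feel for Maximal Clique Centrality (MCC), the sketch below scores each node by summing (|C| - 1)! over the maximal cliques C that contain it. That formula is the definition as we understand it from the method's description and should be checked against the plugin before relying on it. The clique enumeration uses the basic Bron-Kerbosch algorithm, and the helper names and toy network are hypothetical.

```python
from math import factorial


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(clique)  # nothing can extend this clique
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(frozenset(), set(adj), set())
    return cliques


def mcc_scores(adj):
    """Assumed MCC: sum of (clique size - 1)! over maximal cliques per node."""
    scores = {node: 0 for node in adj}
    for clique in maximal_cliques(adj):
        for node in clique:
            scores[node] += factorial(len(clique) - 1)
    return scores


# Toy network: a triangle A-B-C with a pendant node D attached to C.
toy_network = build_adjacency([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
toy_scores = mcc_scores(toy_network)
print(toy_scores)
```

Node C scores highest because it belongs to both the triangle and the C-D edge; for a node whose neighbors do not interact with each other, this score reduces to its degree.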


CytoHubba provides a user-friendly interface for exploring important nodes in biological networks. It computes all eleven methods in one stop. In addition, researchers can combine cytoHubba with other plugins into a novel analysis scheme. The networks and sub-networks captured by this topological analysis strategy will lead to new insights into essential regulatory networks and protein drug targets for experimental biologists. According to the Cytoscape plugin download statistics, cytoHubba has been downloaded around 6,500 times since 2010. CytoHubba is available as a Cytoscape plug-in; for more detail, it can be accessed freely at

Software Name:Biomodule
Inventors:Chia-Hao Chin, Shu-Hwa Chen, Chin-Wen Ho, Ming-Tat Ko, and Chung-Yen Lin

BioModule is a web server to explore modules from biological networks. Because a biological module is a set of genes (or proteins) that have similar sets of interaction partners, we can identify modules by detecting highly connected regions in biological networks. An overview of BioModule is shown as follows. When a user submits a network to BioModule, it uses a clustering method, named CAM (Clique Aggregation Method), to detect biological modules; then, it performs GO enrichment analysis and visualizes the results. 


Software Name:Genome-wide DNA methylation on Cloud: Precise mapping for Reduced representation bisulfite sequencing (RRBS) and paired-end reads
Inventors:Pao-Yang Chen, Liudmilla Rubbi, Amit K. Ganguly, Sherin Devaskar, Matteo Pellegrini at UCLA and Team in IIS

Epigenetic regulation, such as cytosine DNA methylation, is important in gene regulation. Precise measurements of genome-wide DNA methylation at single-base resolution have only recently become possible with bisulfite sequencing based on next-generation sequencing (BS-seq). However, aligning bisulfite-converted reads (single-end and paired-end) remains technically challenging. Here we extend our bisulfite aligner, BS Seeker, to accommodate paired-end mapping. We further propose RRBS-Seeker for mapping reads generated from reduced representation bisulfite sequencing (RRBS). By mapping synthetic RRBS reads against the enzyme-digested fragments, RRBS-Seeker yields higher mappability and higher accuracy than mapping the reads directly against the genome. To demonstrate the use of RRBS-Seeker, we mapped six mouse RRBS lanes. The results reveal lower methylation levels in mice whose mothers underwent caloric restriction, suggesting that parental nutritional intake may alter the epigenetic profiles of offspring.
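The idea of building a reduced reference from enzyme-digested fragments can be sketched in a few lines. The enzyme is not named above; this example assumes MspI, the enzyme most commonly used for RRBS, which cuts C^CGG. The size range, the function names, and the toy genome are hypothetical, and this is not RRBS-Seeker's own code.

```python
def mspi_fragments(genome, min_len, max_len):
    """In-silico MspI digestion (cut after the first C of each CCGG).

    Returns (start, fragment) pairs whose length lies in [min_len, max_len],
    mimicking the size selection step of an RRBS library.
    """
    cuts = [0]
    pos = genome.find("CCGG")
    while pos != -1:
        cuts.append(pos + 1)  # cut between the first and second C
        pos = genome.find("CCGG", pos + 1)
    cuts.append(len(genome))
    fragments = [(s, genome[s:e]) for s, e in zip(cuts, cuts[1:])]
    return [(s, f) for s, f in fragments if min_len <= len(f) <= max_len]


def bisulfite_convert(fragment):
    """Fully convert a forward-strand fragment (every C read as T).

    This models unmethylated cytosines only; methylated Cs would stay as C.
    """
    return fragment.replace("C", "T")


# Toy genome with two CCGG sites.
toy_genome = "AACCGGTTTTCCGGAA"
toy_fragments = mspi_fragments(toy_genome, min_len=5, max_len=8)
converted = [(start, bisulfite_convert(frag)) for start, frag in toy_fragments]
print(toy_fragments)
print(converted)
```

Restricting the reference to size-selected fragments shrinks the search space, which is consistent with the higher mappability reported above for mapping against digested fragments.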

Software Name:PalPALM
Inventors:Lab of Systems Biology and Network Biology & Team led by Prof. Jan-Jan Wu

PalPALM is a standalone program with parallel computation for multiple sequence alignment, model selection, bootstrapping, and tree viewing. Users can submit protein or DNA sequences, either aligned or unaligned. The program automatically searches through all possible models and chooses the best model using the AIC criterion. With the best model for the given sequences, users can use a Maximum Likelihood (ML) or Bayesian approach to estimate the relationships among the sequences and then reconstruct the phylogenetic tree with a few simple clicks.
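Model selection by AIC amounts to computing AIC = 2k - 2 ln L for each candidate model, where k is the number of free parameters and ln L the maximized log-likelihood, and keeping the model with the smallest value. The sketch below shows that rule only; the model names are common substitution models, but the log-likelihoods and parameter counts are invented for illustration and do not come from PalPALM.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_parameters - 2 * log_likelihood


def best_model(fits):
    """Pick the model with the lowest AIC.

    fits: {model_name: (log_likelihood, n_parameters)}
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits; branch-length parameters are left out for brevity.
toy_fits = {
    "JC69": (-1200.0, 0),
    "HKY85": (-1180.0, 4),
    "GTR": (-1178.5, 8),
}
chosen = best_model(toy_fits)
print(chosen, aic(*toy_fits[chosen]))
```

In this made-up case the richest model fits slightly better but is penalized for its extra parameters, so the intermediate model is selected.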

Software Name:myBLAST: A BLAST Web Service and Standalone Program for Customized Databases and Result Analyzer with Parallel Computing
Inventors:Shu-Hwa Chen, Linda Lu, Ming-Hsin Tsai, Shi-Hai Wong, Chao Hsiung and Chung-Yen Lin

Here we present a web tool named myBLAST for building BLAST-searchable, customized databases and for managing databases and BLAST results. This web platform integrates the original BLAST programs with several self-developed programs that use parallel computing. In the intuitive myBLAST interface, users can upload their own sequence collection in FASTA format and specify the name and type of the database to create a BLAST-searchable database. After the database has been formatted, the user receives a notification e-mail from the system and can start BLAST searches against the customized database in the web interface. A BLAST search in myBLAST can take single or multiple sequences at a time, via copy and paste or an uploaded sequence file in FASTA format. The results are presented in a tabular layout for navigation, and the raw BLAST result and a *.csv file are available for download. Information about the jobs submitted to myBLAST is sent by e-mail to notify users when their results are ready.
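Before a sequence collection is turned into a searchable database, it is worth checking that the upload is valid FASTA and deciding whether it holds nucleotide or protein sequences. The sketch below shows one way such a check might look; it is not part of myBLAST, and the function names, the alphabet heuristic, and the example records are assumptions.

```python
def parse_fasta(text):
    """Parse FASTA text into {identifier: sequence}.

    The identifier is the first word after ">". Raises ValueError for
    sequence data before the first header, empty headers, or duplicates.
    """
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            if not words:
                raise ValueError("header without an identifier")
            current = words[0]
            if current in records:
                raise ValueError(f"duplicate identifier: {current}")
            records[current] = []
        else:
            if current is None:
                raise ValueError("sequence data before the first header")
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def guess_type(sequence):
    """Heuristic: only A, C, G, T, U, N means nucleotide, else protein.

    A short protein made up solely of these letters would be misclassified.
    """
    return "nucleotide" if set(sequence.upper()) <= set("ACGTUN") else "protein"


# Hypothetical upload with one DNA and one protein record.
toy_upload = ">seq1 example DNA\nACGT\nACGT\n>seq2\nMKVL\n"
toy_records = parse_fasta(toy_upload)
print({name: guess_type(seq) for name, seq in toy_records.items()})
```

A mixed upload like this one would normally be rejected or split, since a BLAST database holds either nucleotide or protein sequences, not both.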

myBLAST is freely accessible at

The standalone version for Windows/Mac, with a screencast, can be found and downloaded at

Software Name:HUNTER: A hub-attachment based method to detect functional modules from confidence-scored protein interactions and expression profiles
Inventors:Chia-Hao Chin, Shu-Hwa Chen, Chin-Wen Ho, Ming-Tat Ko, Chung-Yen Lin

Our proposed HUNTER method has been applied to yeast data, and the empirical results show that it can accurately identify functional modules. Applications derived from our algorithm can reconstruct the biological machinery, identify undiscovered components, and decipher common sub-modules inside complexes such as RNA polymerases I, II, and III.

A C++ implementation of our prediction method, dataset and supplementary material are available at

Software Name:Elegance: Electronic Laboratory Notebook
Inventors:Shu-Hwa Chen, Chi-Wei Huang, Tengi Huang and Chung-Yen Lin

For a long time, scientists used pens and glue to record their findings, musings, ideas, and inferences in paper-bound laboratory notebooks. This old practice has lasted even into the 21st century. However, handwritten, paper-based recording cannot keep up with data of increasing volume and complexity, and it makes data sharing difficult in collaborative projects across disciplines and research communities. As more and more outputs are generated in the digital deluge, a knowledge repository platform with functions such as search, backup, and reconstruction will become important for daily record keeping in today's laboratories.

In our conception, an electronic laboratory notebook (ELN) should not only help scientists keep everything on record, but also raise the possibility of new discoveries and problem solving, which may significantly increase the competitiveness of the whole research team. Although some ELNs are available commercially and in the public domain, their interfaces and prices are not user-friendly, and they have a steep learning curve. The essential functions of our ELN will include simple installation with a few clicks, note creation with attached digital experimental outputs, full-text search with an image gallery, succinct user management with digital signatures, automatic system backup, a calendar with notification of upcoming events, a personalized interface with privacy, data sharing and exchange via the web, duplication and backup of the whole ELN, high extensibility, and all the features found in Web 2.0. We have developed a draft of a purely web-based ELN (Windows/Mac version) that can be deployed on most available PCs and portable devices, instead of an ELN server/client architecture that requires substantial manpower. Two kinds of robust ELN will be released soon: a group ELN designed as a collaboration platform for research teams, and a portable ELN suitable for personal use as a mobile web blog. Currently, we have developed ELN in English, Traditional Chinese, and Japanese (Windows/Mac version). We have also received support from Microsoft Inc. to migrate ELN to the Azure cloud for the research community. Through dissemination at international and domestic conferences, we plan to present our ELN to the research community on schedule, toward a revolution of the lab notebook.

In brief, we believe the ELN developed by our team will truly help the research community by supporting invention, sharing information, reorganizing knowledge, and making actual laboratory work visible. In turn, feedback from users will prompt new developments on the IT issues raised by the emerging mass of experimental results.


Screen casts and prototype of ELN: