Latest Version .jar File

View on GitHub

User Manual

Get PiiLer

How to?

Run PiiL

Open a KGML file

Load methylation data

Select a subset of CpG sites

Load expression data

Load samples information

Navigate through samples

Right-click menu options

Multiple-sample view

Group-wise view

Generate a PiiLgrid

Highlight a set of genes in a pathway

Export pathway to image

Export list of matched genes in each pathway

Duplicate the current pathway in a new tab

Cite us

Decode the name "PiiL"

Run PiiL

Get the latest version of PiiL, together with some data for testing, from here. Unzip the downloaded file. To run PiiL on Mac just double-click on the .jar file. For Windows and Linux platforms, open a terminal, go to the directory that you have the .jar file and run "java -Xmx4096m -jar PiiL-vx.xx.jar".

back to top

Open a KGML file

There are two options to open a KGML file:

open KGML

1) Open it by selecting the pathway and organism name from the KEGG database. This will download the file and for the next time will use the downloaded file.
2) Open an already downloaded KGML file from your hard drive.
By typing in the boxes labeled as 'Search', the comboboxes list only the pathways / organisms that contain the string typed in the box.

open from web

back to top

Load methylation data

After opening a pathway, you can load your meta-data over that pathway. The CpG IDs - if the data comes from a methylation array - or CpG positions -if the data is from whole genome bisulphite sequencing - need to be replaced with their annotated gene name. Any other information (for example the CpG ID or its genomic region) can come after an underscore in the first column. A script to convert the data into this format is available here. The first row lists the samples IDs. The beta values must be between 0 and 1.
A valid input file looks like the following (the column separator is selectable):

sampleIDS	sample1	sample2	...	sampleN
TLR2[_cg00000884_chr4:154609857_intronic]	beta_value	beta_value	...	beta_value
ELOVL1[_cg00001446_chr1:43831041_exonic]	beta_value	beta_value	...	beta_value
...	...	...	...	...
ROCK2[_cg00001594_chr2:11484705_UTR5]	beta_value	beta_value	...	beta_value

To load the methylation data, click on New under Load -> Methylation. Once a methylation data file is loaded, it will be listed under the Load -> Methylation menu to be used for other pathways directly.
Customization of the input file is possible via a window that emerges after selecting a file. If the file has been converted with PiiLer you can just press OK.

methylation input

The genes in the input file that match with the genes in the pathway are colored according to their beta values. Blue color codes the hypo-methylation status (the closer the beta value is to zero the color is darker blue) and red color codes the hyper-methylation status (the closer the beta value is to 1 the color is darker red).
The samples IDs are listed in the first combobox and the matched genes are listed in the second combobox where the number of matches is shown by the label below it.

load methylation

back to top

Select a subset of CpG sites

PiiL provides four options to select a subset of all CpG sites that hit each gene, available at Tools -> Select a subset of CpG sites. When these subselection is applied, the genes with no CpG passing the criteria are colored in light gray.

1) Selecting sites by standard deviation filtering: This option aims to filter out the CpG sites that have very little variation over all samples according to an adjustable threshold for the standard deviation of the beta values for a site over all samples. This option facilitates the visibility of the sites that differ significantly between the samples; where there are multiple CpG sites hitting a gene that do not vary significantly between the samples and averaging them for color-coding a gene the strong signal is weakened.

2) Selecting sites by genomic region name: If the CpG sites are annotated with a genomic region, the user can select specific sites, for example the ones that are intronic, exonic, UTR5 and so on.

3) Selecting sites by their beta values: includes the sites based on user defined ranges for beta-values.

4) Loading a list of pre-selected sites: a list of pre-selected CpG sites can be loaded and the sites that do not exist on the list will be excluded.

back to top

Load expression data

Genes and their expression values (FPKM) are the second type of meta-data that you can load on your pathway of interest. Make sure the input file has the sample IDs on the first row and gene names on the first column. For each gene, the expression values appear on its following columns.
A valid input file looks like the following (the column separator is selectable):

sampleIDS	sample1	sample2	...	sampleN
GeneA	expression_value	expression_value	...	expression_value
GeneB	expression_value	expression_value	...	expression_value
...	...	...	...	...
GeneX	expression_value	expression_value	...	expression_value

To load the gene expression data, click on New under Load -> Gene Expression. Once a gene expression data file is loaded, it will be listed under the Load -> Gene expression menu to be used for other pathways directly.
The genes in the input file that match with the genes in the pathway are color-coded according to the k-fold difference between their expression level and the median expression of all the samples. The samples IDs are listed in the first combobox and the matched genes are listed in the second combobox where the number of matches is shown by the label below it.

load expression

The k-fold difference value is adjustable via Tools -> Manage color-coding and its default value is 4. Blue color codes the over-expressed status and red color codes the under-expressed status.

range manager

range manager

back to top

Load samples information

If the samples in the loaded meta-data files have additional information like gender, age, ethnicity, disease stage and so on, you can load the file with matched sample IDs and the additional data will be shown for each sample.
Samples information can be loaded from Load -> Samples information menu, by choosing New or a previously loaded samples information file. In the form that shows up after selecting the input file, you can change the default choices for the 'columns separator' and the column containing sample IDs.
If the samples are from TCGA data sets, check the related checkbox for special matching according to TCGA sample barcodes.

sample info

If a valid file is loaded and the samples IDs in the input file match with the ones loaded on the pathway, the additional information for each sample is displayed while navigating through samples. You can select which fields to be shown by clicking on the 'Edit Fields' button (in the top right corner).

sample info example

sample info example

back to top

Navigate through samples

When the meta data is loaded you can navigate through samples using the buttons on the left side panel to see the status of the genes in the pathway for each sample. The 'Play' button shows the samples consecutively. If the 'Repeat' checkbox is checked the playback is repeated from the beginning when the last sample is shown. The speed of playback can be set using the slider. A short explanation (tooltip) is shown when hovering and staying over side panel items with the mouse.

navigate

back to top

Right-click menu options

A menu is available for all the genes when you right-click on the gene rectangle, with the following options:

1) Check this gene on ... : The first option is enabled for all the genes. This opens a tab in the default browser and checks the target gene's information on your selected database: GeneCards, Pubmed or Ensemble.

right click menu

2) Histogram for all samples: The second option makes a histogram for the beta-values or expression values of all samples for the targeted gene.

histogram

3) Barplot for raw data is another representation of beta-values or expression values for all samples.

barplot

4) Show CpG site(s) details: this option is available when DNA methylation data has been loaded. A gene with a green-colored border show that multiple CpG sites have been annotated to this gene and the color is the average of all CpG sites. If the genomic region is provided in the input data (genomic region can be attached to the gene name after an underscore), you can right click on the gene and all CpG sites for this gene - labeled with the genomic region tag - shows up on the screen. Through this feature the user can navigate through the samples and check the status of each CpG site rather than the color coded as an average of all the sites. In addition to the standard deviation filtering option, the user can also select/deselect specific sites to be included/excluded in the analysis. The sites with no valid or not available (NA) beta values are shown in gray. By hovering the mouse over each CpG site, its beta values will be shown.

genomic regions

genomic regions

5) Multiple-sample view: this is explained in this section.

6) Find genes with similar pattern: using this option, the methylation or expression pattern of the samples for the selected genes are compared with the other genes in the pathway, or all the genes in the loaded metadata file. The Euclidean distance between the value of each sample for targeted gene and other genes is calculated and the most similar or dissimilar ones are listed. The results can be saved as a file to be used later.
When choosing "The whole loaded metadata", it is possible to find some genes that are not part of any KEGG pathway, in this case "Generate PiiLgrid" option can be used to create a grid of genes with no connection between them. PiiLgrid is explained in this section.

similar genes

similar genes

7) sort samples: the final option sorts the samples according to the methylation or expression values of all the samples for the selected gene. The sorting can be done in an ascending or descending order. This option can be used for the playback option of while navigating through the samples.

back to top

Group-wise view

Having loaded the samples information file, the values of methylation/expression of each gene can be shown in a group-wise manner. Samples can be grouped by one of the columns of the samples information file. To see the pathway in Group-wise view select View -> Group-wise view for all/selected genes. To go back to Single-sample view, select View -> Show single-sample view for all genes.

grouping form

The grouping shows the average methylation/expression over the samples belonging to each group. The samples grouping is summarized with the name of each group and number of samples in that group in the parenthesis.
For the expression data, there are two options:
1) comparing the average expression value of the samples of each group to the average of all samples.
2) comparing the average expression value of the samples of each group, with the average of the samples belonging to a group chosen as the 'base group'.

group-wise view

In the group-wise view mode, barplot and histogram are also shown with the selected grouping.

barplot grp

barplot grp

histogram
grp

back to top

Multiple-sample view

This function visualizes the metadata for the current sample and the next 10 samples, for the selected or all (not recommended because the pathway will look very crowded) genes. If the number of samples from the current one to the end of the sample set is less than 10, as many as possible are displayed.
To select a gene that is color-coded (i.e. has some metadata loaded), double click on the gene rectangle. The selected genes' borders turn to yellow.
Hint: To choose the direction of expansion for multiple-sample view, double-click on your desired quarter of the gene rectangle while selecting it (image a gene rectangle is divided to four equal rectangles). For example double clicking on the north-west quarter of the gene sets it to show next samples on the right top side of it.

If you hover the mouse over the extended rectangles, a tooltip shows the sample that rectangle is representing.
You can right-click on any of the expanded genes and choose Single-sample view to undo the expansion. To go back to Single-sample view for all the expanded genes (the ones in multiple-sample view), select Tools -> Show single-sample view for all genes.

back to top

Generate a PiiLgrid

Sometimes most of genes we are interested in are not part of any of the KEGG pathways. In this scenario, PiiL provides a grid of not connected genes called PiiLgrid where all the functionalities of PiiL are applicable. A PiiLgrid can be generated either via File -> Generate a PiiLgrid or out of selected genes listed in the "Find genes with similar pattern" form.
In the first case, the input file is a text file with the name of each gene on each row.

piilgrid

back to top

Highlight a list of genes in a targeted pathway

If you have a list of genes and want to check the overlap of this list with a pathway of interest, first open the pathway and then select Load -> List of genes to load your file. The genes that exist in the targeted pathway will be highlighted with a red text and red border.

highlight genes

back to top

Export pathway to image

The visualized pathway and its color-coded meta data can be exported to a vector based image (and other formats). This function is available under the Export -> Current pathway to image menu with two options:
1) The visible part of the pathway
2) The entire pathway

back to top

Export list of matched genes in each pathway

After openning multipe pathways in multiple tabs and loading meta data for each of them, this function can be used to get a summary of each tab containing the pathway, the meta data loaded over that and the list of genes that were matched in each pathway. By clicking on Export -> List of matched genes in each pathway you can choose a file that keeps this summary.

back to top

Duplicate the current pathway in a new tab

Having an open tab with a visualized pathway and meta data loaded, you can choose the Tools -> Duplicate the current pathway option to quickly duplicate the current tab's pathway (without the metadata) in a new tab.

back to top

Cite us

If you PiiL your data please cite our publication Moghadam, Behrooz Torabi, et al. "PiiL: visualization of DNA methylation and gene expression data in gene pathways." BMC genomics 18.1 (2017): 571.

back to top

Decode the name "PiiL"

Look at the logo carefully. Do you recognize a cute elephant there?! Piil (pronounced as meal) means elephant in Persian (with Sanskrit origin). And on a chess board the chessman called bishop is actually a piil (elephant) standing next to a hosrse (knight)!

back to top