In this tutorial we walk through a small example analysis using the MACS peak finding algorithm.

Data used in this tutorial can be found from http://paja.nic.funet.fi/home/akallio/peakROTS/download/peakROTS_tutorial_data_v1.zip. The example dataset is a restricted version of the NRSF_1 dataset. Download, unzip and place the example data files into a subdirectory called data.

Next download and install the peakROTS package. Installation instructions for the peakROTS package can be found from the main page http://www.nic.funet.fi/pub/sci/molbio/peakROTS/#download.

To be able to use the MACS application, you must download MACS version 1.3/1.4 from: http://liulab.dfci.harvard.edu/MACS. Install MACS so that invoking macs or macs14 on the command line starts the application. If environment specific initialisation shell code is needed, it can be given later in the call to initialise function.

Start working in an interactive R session. First the package is loaded with:
library(peakROTS)

For help on using the package, type:
help(peakROTS)

In our example, we examine the difference of two shiftsize options, while keeping other parameters at MACS default value. To initialise the example analysis session, use:

	initialise(
		path.work = "work",
		data.path = "data",
		treatment.file = "treatmentdata_example.dat",
		control.file = "controldata_example.dat",
		shiftsize = list(10, 100),
		tsize = list(25),
		bw = list(300),
		nolambda = list(FALSE),
		pvalue = 10^-2,
		bootstrap.count = 10,
		r.command = "R"
	)

Note! You must set the parameter r.command to match the R executable under which you installed the peakROTS package.

The initialisation will create a subdirectory called work and fill it with required runtime data. Of interest are especially files pending-jobs.txt, running-jobs.txt and finished-jobs.txt. Each job has a single line in one of the files and all jobs go through the files as their states progress from pending to running and from running to finished.

Once the analysis has been initialised, processing can be started:

	run(
		path.work = "work", 
		do.run = do.run.local,
		jobs.running.max = 10
	)

The call will start to process jobs locally in 10 separate processes. The parameter do.run controls how jobs are run (in separate processes or via batch processing system).

It is possible to stop the master node by typing Ctrl+C. Jobs will continue to run, so either you have to wait for them to finish or kill them using operating system's tools. If the analysis is continued in a new R session, you must first load the settings from the work directory:
load.settings(path.work = "work")

Now, run(...) can be called again to continue running the jobs. Different do.run implementation from the original can be given.

After all the jobs have been finished, the master node stops and the call to run(...) returns. Results can be viewed from work/results.

	load("work/result/result.Rdata")
	print(repro.summary)
	print(cat("ROTS selected the parameter combination: ", 	result))

Peak detection results on the original data can be found from work/peaks, following the naming scheme Original_<PARAMS>, where <PARAMS> is the same parameter string that is also reported in the results.

 

Back to main page...