Rnmr1D Example

A detailed example

This example is based on a subset of 18 NMR spectra of human urine samples from the MetaboLights MTBLS1 study (http://www.ebi.ac.uk/metabolights/MTBLS1). This subset of spectra was converted into the nmrML format (http://nmrml.org/) using the online converter (http://nmrml.org/converter/). The converted spectra are stored in the input/ADG1_nmrML directory.

To process this subset of spectra in the same way as the full dataset, we have to use the same macro-command file. All macro-commands were generated interactively and then exported from the NMRProcFlow web application (v1.2.6; see NMRProcFlow Processing methods), except those for bucketing. The web application cannot yet manage this type of macro-command through its interface (TODO), so the bucketing macro-commands were added manually to the macro-command file.

input/MTBLS1/NP_macro_cmd_ADG1_nmrML.txt

#%% Vendor=nmrml; Type=fid; LB=0.3; BLPHC=TRUE; ZF=4; PHC1=TRUE; FP=0; ZNEG=FALSE; TSP=TRUE; 

# Global Baseline Correction: PPM Range = (  -0.499979640386268  ,  10.9998612451513  )
gbaseline 10.2 10.5 -0.499979640386268 10.9998612451513 75 52.5 

# Baseline Correction: PPM Range = (  2.963 , 4.72  )
airpls 2.963 4.72 1 

# Baseline Correction: PPM Range = (  0.5 , 2.963  )
airpls 0.5 2.963 1 

# Baseline Correction: PPM Range = (  6.338  ,  9.542  )
airpls 6.338 9.542 2 

# Baseline Correction: PPM Range = (  4.865  ,  6.207  )
airpls 4.865 6.207 4 

# Baseline Correction: PPM Range = (  0.674  ,  1.601  )
airpls 0.674 1.601 3 

# Alignment of the selected zones ( 6.233 , 9.697 )
clupa 10.2 10.5 6.233 9.697 0.01 5 0

# Alignment of the selected zones ( 4.854 , 5.703 )
clupa 10.2 10.5 4.854 5.703 0.01 5 0

# Alignment of the selected zones ( 4.241 , 4.687 )
clupa 10.2 10.5 4.241 4.687 0.01 5 0

# Alignment of the selected zones ( 2.967 , 4.244 )
clupa 10.2 10.5 2.967 4.244 0.02 5 0

# Alignment of the selected zones ( 1.616 , 2.968 )
clupa 10.2 10.5 1.616 2.968 0.03 5 0

# Alignment of the selected zones ( 0.684 , 1.592 )
clupa 10.2 10.5 0.684 1.592 0.01 5 0

#
# Zeroing the selected zones ...
#
zero
4.705 4.881
5.484 6.333
EOL

# AIBIN Bucketing - Resolution = 0.5 - SNR = 5 - Append Buckets = No
bucket aibin 10.2 10.5 0.5 5 0
0.6 4.72
4.865 5.5
6.3 9.5
EOL
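
The structure of the file above (a '#%%' header line of Key=Value pairs, '#' comment lines, one command per line, and zone lists closed by EOL) can be illustrated with a small reader. This is only a sketch in Python, not Rnmr1D's own parser: the function name parse_macro is hypothetical, and it assumes that only 'zero' and 'bucket' are followed by a list of PPM zones, as is the case in this file.

```python
def parse_macro(text):
    """Split macro-command text into (preprocessing params, list of commands)."""
    params, commands, block = {}, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#%%"):
            # Header line: 'Key=Value;' pairs describing the pre-processing
            for pair in line[3:].split(";"):
                if "=" in pair:
                    key, value = pair.split("=", 1)
                    params[key.strip()] = value.strip()
        elif not line or line.startswith("#"):
            continue  # blank line or plain comment
        elif line == "EOL":
            block = None  # end of the current zone list
        elif block is not None:
            low, high = map(float, line.split())
            block["zones"].append((low, high))
        else:
            name, *args = line.split()
            cmd = {"name": name, "args": args, "zones": []}
            commands.append(cmd)
            # Assumption: only these two commands carry a zone list
            if name in ("zero", "bucket"):
                block = cmd
    return params, commands
```

Calling parse_macro on the content of NP_macro_cmd_ADG1_nmrML.txt would give the pre-processing parameters as a dictionary and the commands in file order, which is a convenient way to check a manually edited bucketing section before launching the run.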

We now launch the whole sequence of macro-commands. First, the pre-processing step is applied. Here, pre-processing means transforming the NMR spectrum from the time domain to the frequency domain, including the fast Fourier transform (FFT) and the phase correction (see NMRProcFlow Pre-processing). Then the rest of the sequence, which covers the spectra processing (from baseline correction through alignment, ending with bucketing), is applied.

# Delete previous results (the output directory must exist for 'tee')
mkdir -p ./examples/output/ADG1
rm -f ./examples/output/ADG1/*

# Launch Rnmr1D to process a set of NMR spectra
docker run -i --rm -v "$(pwd)/examples:/data" nmrprocflow/rnmr1d --debug \
   --input /data/input/ADG1_nmrML \
   --proccmd /data/input/MTBLS1/NP_macro_cmd_ADG1_nmrML.txt \
   --outnorm PQN \
   --cpu 4 \
   --outdir /data/output/ADG1 | tee ./examples/output/ADG1/stdout.log

Note: In the command shown above, the 'examples' directory is mounted at "/data" inside the docker container, so the container sees the 'examples' directory as "/data" in its own filesystem. Because all commands are executed inside the container, every input/output path in the arguments must be given with "/data" as its root.
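
The path translation described in the note can be written down as a small helper. This is a minimal Python sketch: the function name to_container_path and the host directory /home/user/project are hypothetical, and only the mount point "/data" comes from the command above.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, host_root, mount_point="/data"):
    """Translate a host path under host_root into the path seen in the container."""
    # relative_to raises ValueError if host_path is not inside host_root,
    # i.e. if the file would not be visible through the mount at all.
    relative = PurePosixPath(host_path).relative_to(PurePosixPath(host_root))
    return str(PurePosixPath(mount_point) / relative)


# Hypothetical host layout: the 'examples' directory lives in /home/user/project
print(to_container_path(
    "/home/user/project/examples/input/ADG1_nmrML",
    "/home/user/project/examples",
))
```

A path outside the mounted directory raises a ValueError, which mirrors the fact that the container cannot see files that are not under the mounted 'examples' directory.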

The main resulting files:

  • data_matrix.txt: data matrix (tab-separated values)

  • bucket_out.txt: bucket list (tab-separated values)

  • specs.pack: processed NMR spectra (binary format)
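
The two text files can be inspected with any tab-aware reader. The following Python sketch assumes the files are tab-separated with a single header line; the function name read_matrix is hypothetical, and the column names are not documented here, so nothing in the code depends on them.

```python
import csv


def read_matrix(handle):
    """Read a tab-separated table and return (header, rows) as lists of strings."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)                   # assumption: first line holds column names
    rows = [row for row in reader if row]   # skip empty lines
    return header, rows
```

For instance, opening examples/output/ADG1/data_matrix.txt with open(path, newline="") and passing the handle to read_matrix gives the number of rows with len(rows) and the number of columns with len(header).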

Visualize / explore the processed NMR spectra using the NMR viewer

  • First, launch the NMR viewer by creating the corresponding docker container
docker run -d -v `pwd`/examples/output:/opt/data -p 8081:80  docker.io/nmrprocflow/nmrview

Visualize / explore the processed NMR spectra using R

  • The resulting NMR spectra are stored in a binary format (pack). The binary format reduces the size on disk and, above all, drastically reduces the loading time (in an R script, for instance), because converting text to binary values is time-consuming, especially when there are millions of data points. Fortunately, R can read binary files with the 'readBin' function.

  • Here is an R script that reads the pack file, then plots the spectra as a stacked plot:

packfile <- "<full path>/examples/output/ADG1/specs.pack"

#-------- Begin: Read Pack File ------------
to.read <- file(packfile, "rb")

# File Header: 24 bytes = 2x8 (double) + 2x4 (integer)
ppm_range <- readBin(to.read, what="double",size=8, n=2, endian = "little")
data_dim <- readBin(to.read, what="integer",size=4, n=2, endian = "little")

# Read data
specdata<-readBin(to.read, what="double",size=8, n=prod(data_dim), endian = "little")
specmat <- t(matrix(specdata, nrow=data_dim[2], ncol=data_dim[1], byrow=TRUE)[,2:(data_dim[1]-1)])
ppm <- seq(from=ppm_range[2], to=ppm_range[1], by=(ppm_range[1]-ppm_range[2])/(dim(specmat)[2]-1))

close(to.read)

#-------- End: Read Pack File ------------

cols <- rainbow(dim(specmat)[1], s=0.8, v=0.75)

# Stacked Plot
Ymax <- 5e+12          # intensity limit
ppm_lim <- ppm_range   # c( ppm min, ppm max )
K <- 0.67              # Graphical height of the stack (0 .. 1)

plot( cbind(ppm, specmat[1,]), xlim=rev(ppm_lim), ylim=c(0,Ymax), type="l", col="blue")
for( i in 2:dim(specmat)[1] )
   lines(cbind(ppm, specmat[i,] + (i-1)*K*Ymax/dim(specmat)[1]), col=cols[i])
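
The same file can be read outside R. The sketch below is a Python transcription of the reading part of the R script above, using only the standard library; the layout (two little-endian doubles, two 32-bit integers, then the data as doubles) is inferred from that script rather than from a format specification, and the function name read_pack is hypothetical.

```python
import struct

# Header: 2 doubles (ppm range) + 2 int32 (dimensions), little-endian, no padding
HEADER = struct.Struct("<2d2i")


def read_pack(path):
    """Return (ppm_range, dims, spectra) from a pack file, mirroring the R script."""
    with open(path, "rb") as fh:
        ppm_a, ppm_b, d1, d2 = HEADER.unpack(fh.read(HEADER.size))
        values = struct.unpack(f"<{d1 * d2}d", fh.read(8 * d1 * d2))
    # The values form a row-major table of d2 rows by d1 columns.
    rows = [values[r * d1:(r + 1) * d1] for r in range(d2)]
    # As in the R script, the first and last columns are dropped and the
    # table is transposed, so each remaining column becomes one spectrum.
    spectra = [[row[c] for row in rows] for c in range(1, d1 - 1)]
    return (ppm_a, ppm_b), (d1, d2), spectra
```

With this reading, len(spectra) is the number of spectra and len(spectra[0]) the number of points per spectrum, matching dim(specmat) in the R script.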

output/ADG1/stdout.log

Rnmr1D:
Rnmr1D:   Version: 1.2.22
Rnmr1D:
Rnmr1D:  Unzip the ZIP file ...
Rnmr1D:  --- READING and CONVERTING ---
Rnmr1D:  Generate the 'samples' & 'factors' files from the list of raw spectra
[4/18]: ADG10003u_010_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.240471
OK

[1/18]: ADG10003u_007_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.239402
OK

[2/18]: ADG10003u_008_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.240471
OK

[3/18]: ADG10003u_009_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.243219
OK

[8/18]: ADG10003u_022_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.240624
OK

[5/18]: ADG10003u_015_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.243067
OK

[6/18]: ADG10003u_016_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.241693
OK

[7/18]: ADG10003u_017_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.242151
OK

[9/18]: ADG19007u_011_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.243830
OK

[12/18]: ADG19007u_014_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.241693
OK

[11/18]: ADG19007u_013_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.242303
OK

[10/18]: ADG19007u_012_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.246731
OK

[16/18]: ADG19007u_071_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.240929
OK

[13/18]: ADG19007u_015_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.241693
OK

[15/18]: ADG19007u_017_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.248563
OK

[14/18]: ADG19007u_016_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.241693
OK

[17/18]: ADG19007u_072_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.242303
OK

[18/18]: ADG19007u_073_10.nmrML
-----
Read the FID ...OK
Preprocessing ...
        Exp. Line Broadening (LB=0.300000)
        TD = 32768
        Zero Filling (x4)
        SI = 131072
        Applied GRPDLY ...OK
        FFT ...OK
OK
Optimizing the zero order phase ...OK
Optimizing the first order phase ...OK
PPM calibration based on TSP  ... PPM min =-5.241387
OK

Rnmr1D:  Generate the final matrix of spectra...
Rnmr1D:
Rnmr1D:  Write the spec.pack file ...
Rnmr1D:  Write the list_pars.txt file ...
Rnmr1D:
   user  system elapsed
 66.682   0.537  27.905
Rnmr1D: ------------------------------------
Rnmr1D: Process the Macro-commands file
Rnmr1D: ------------------------------------
Rnmr1D:
Rnmr1D:  Baseline Correction: PPM Range = ( 2.963 , 4.72 )
Rnmr1D:     Type=airPLS, lambda= 1
Rnmr1D:  Baseline Correction: PPM Range = ( 0.5 , 2.963 )
Rnmr1D:     Type=airPLS, lambda= 1
Rnmr1D:  Baseline Correction: PPM Range = ( 6.338 , 9.542 )
Rnmr1D:     Type=airPLS, lambda= 2
Rnmr1D:  Baseline Correction: PPM Range = ( 4.865 , 6.207 )
Rnmr1D:     Type=airPLS, lambda= 4
Rnmr1D:  Baseline Correction: PPM Range = ( 0.674 , 1.601 )
Rnmr1D:     Type=airPLS, lambda= 3
Rnmr1D:  Alignment: PPM Range = ( 6.233 , 9.697 )
Rnmr1D:     CluPA - Resolution =0.01 - SNR threshold=5 - Reference=0
Rnmr1D:  Alignment: PPM Range = ( 4.854 , 5.703 )
Rnmr1D:     CluPA - Resolution =0.01 - SNR threshold=5 - Reference=0
Rnmr1D:  Alignment: PPM Range = ( 4.241 , 4.687 )
Rnmr1D:     CluPA - Resolution =0.01 - SNR threshold=5 - Reference=0
Rnmr1D:  Alignment: PPM Range = ( 2.967 , 4.244 )
Rnmr1D:     CluPA - Resolution =0.02 - SNR threshold=5 - Reference=0
Rnmr1D:  Alignment: PPM Range = ( 1.616 , 2.968 )
Rnmr1D:     CluPA - Resolution =0.03 - SNR threshold=5 - Reference=0
Rnmr1D:  Alignment: PPM Range = ( 0.684 , 1.592 )
Rnmr1D:     CluPA - Resolution =0.01 - SNR threshold=5 - Reference=0
Rnmr1D:  Bucketing the selected PPM ranges ...
Rnmr1D:     AIBIN - Resolution =0.5 - SNR threshold=5 - Append=0
Rnmr1D:     Zone 2 = ( 4.865 , 5.5 ), Nb Buckets = 35
Rnmr1D:     Zone 3 = ( 6.3 , 9.5 ), Nb Buckets = 188
Rnmr1D:     Zone 1 = ( 0.6 , 4.72 ), Nb Buckets = 508
Rnmr1D:     Total Buckets = 675
Rnmr1D:
Rnmr1D:  Write the spec.pack file ...
Rnmr1D:
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1408868 75.3    2164898 115.7  2164898 115.7
Vcells 1447481 11.1    6979154  53.3 11289541  86.2
   user  system elapsed
182.921   7.347  50.482
Rnmr1D: ------------------------------------
Rnmr1D: Process the file of buckets
Rnmr1D: ------------------------------------
Rnmr1D:
Rnmr1D: NB Buckets = 675
Rnmr1D:
Rnmr1D:
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1409863 75.3    2164898 115.7  2164898 115.7
Vcells 1451737 11.1    4466658  34.1 11289541  86.2
   user  system elapsed
  0.964   0.012   0.975