Effects of chronic, long-term heat stress on gene expression in tomato pollen
This repository houses data analysis source code and processed data files from a heat stress RNA-Seq experiment on tomato pollen performed in the laboratory of Nurit Firon. Sequencing was sponsored by the Pollen Research Coordination Network to provide a resource for pollen biologists and to serve as an example data set for the 2014 UNC Charlotte Workshop in Next-Generation Sequencing.
Sequence data are available from the Sequence Read Archive under accession SRP055068.
If you use the data or results from this repository in your research, please cite:
Loraine AE, Blakley IC, Jagadeesan S, Harper J, Miller G, Firon N. Analysis and Visualization of RNA-Seq Expression Data Using RStudio, Bioconductor, and Integrated Genome Browser. Methods Mol Biol. 2015;1284:481-501. doi: 10.1007/978-1-4939-2444-8_24. PubMed PMID: 25757788.
Tomato plants of the heat-tolerant cultivar Hazera 3042 were grown in two temperature-controlled greenhouses. One greenhouse (control) was configured for optimal growth, with a daytime temperature of 25 deg C and a nighttime temperature of 18 deg C. The other greenhouse (treatment) was set to a daytime temperature of 32 deg C and a nighttime temperature of 26 deg C.
Several times during September and October of 2013, pollen samples were collected from plants in each greenhouse. Each collection consisted of one batch of pollen harvested from several plants in the control group and another batch harvested from several plants in the treatment group. In addition, pollen viability, germination, and number of pollen grains per flower were measured for each batch. The results of these measurements are in a PowerPoint presentation in this repository.
RNA was extracted from each batch and sent to the laboratory of Jeff Harper, who forwarded the samples to a sequencing facility at UCLA, which made libraries for Illumina sequencing from five treatment samples and five control samples. The ten libraries were combined into a single lane for sequencing on a HiSeq instrument. Paired-end sequencing was done for 69 cycles per end.
Data processing and differential expression analysis
Sequence reads were aligned onto version 2.5 of the Solanum lycopersicum genome assembly using the tophat2 spliced alignment tool. Alignments of cDNA fragments overlapping annotated ITAG2.4 genes were counted using featureCounts. Only fragments (read pairs) that aligned to a single gene and a single location in the genome were counted. Differential expression of tomato genes under heat stress was detected using the edgeR library from Bioconductor. Gene Ontology enrichment analysis was done using the GOSeq library from Bioconductor and Gene Ontology annotations provided by the Sol Genomics Network.
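The counting rule described above (a fragment counts only if it aligns to one genomic location and overlaps one gene) can be illustrated with a small Python sketch. This is not the featureCounts implementation and is not code from this repository. The function name, the input layout (pairs of hit count and overlapping gene list), and the gene identifiers are all hypothetical and exist only for illustration.

```python
from collections import Counter


def count_unique_fragments(fragments):
    """Count fragments per gene, keeping only unambiguous fragments.

    `fragments` is an iterable of (num_genomic_hits, overlapping_genes)
    pairs. This layout is a hypothetical stand-in for parsed alignments.
    """
    counts = Counter()
    for num_hits, genes in fragments:
        # Skip fragments that align to more than one genomic location.
        if num_hits != 1:
            continue
        # Skip fragments that overlap no gene or more than one gene.
        unique_genes = set(genes)
        if len(unique_genes) != 1:
            continue
        counts[next(iter(unique_genes))] += 1
    return counts


# Toy input with made-up gene identifiers.
toy_fragments = [
    (1, ["geneA"]),           # counted for geneA
    (1, ["geneA", "geneB"]),  # ambiguous: overlaps two genes
    (2, ["geneB"]),           # ambiguous: two genomic locations
    (1, []),                  # overlaps no annotated gene
    (1, ["geneB"]),           # counted for geneB
]
toy_counts = count_unique_fragments(toy_fragments)
```

In this toy input only the first and last fragments pass both filters, so each of the two genes receives a single count.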
Alignments of reads and junction features deduced from splice read alignments can be viewed using Integrated Genome Browser, which is freely available from BioViz.org.
To view the data in IGB:
- Get a copy of IGB (http://www.bioviz.org)
- Under the Current Genome tab, select species S. lycopersicum and the Feb. 2014 (2.5) genome version. Or click the tomato image (left of the Mona Lisa image).
- The tomato pollen data are available in the Data Access tab in the folder named Pollen
Cufflinks gene models
Alignments of reads that mapped exactly once to the genome were combined into one large file containing both treatment and control reads, which was provided as input to Cufflinks. The resulting GTF file was converted to BED-detail format and will be made available here and in the IGB QuickLoad server for visualization.
About folders in this repository
Folders contained in this repository represent data analysis modules that are mostly independent but sometimes use files and results from other modules.
This module compares genes expressed in tomato pollen to genes expressed in Arabidopsis pollen. Information about Arabidopsis gene expression is from Loraine (2013) RNA-Seq of Arabidopsis pollen uncovers novel transcription and alternative splicing. This module depends on results files from the DifferentialExpression folder.
This module processes and evaluates output from featureCounts, which was used to count the number of fragments overlapping tomato genes annotated as part of ITAG2.4. The major output of this module is a file listing tomato genes and the number of counts per gene for each treatment and control sample library.
This module uses counts data and the edgeR Bioconductor library to identify differentially expressed genes and create files suitable for manual analysis using the LycoCyc Cellular Overview metabolic pathways visualization tool and other programs. It depends on code in the Counts module having been run.
This folder contains documentation describing experimental conditions and other information critical to understanding and interpreting the data analysis results.
Contains data files downloaded from other Web sites or generated in upstream bioinformatics data processing steps.
This module describes using GOSeq to identify Gene Ontology categories with unusually many differentially expressed genes. It depends on the DifferentialExpression module.
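To make "unusually many differentially expressed genes" concrete, the sketch below computes a plain hypergeometric upper-tail probability for one category. This is a simplified stand-in and not what GOSeq does: GOSeq additionally corrects for gene length bias, which this sketch ignores. The function name and the example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes, genes_in_category, de_genes, de_in_category):
    """Probability of seeing at least `de_in_category` category members
    among `de_genes` genes drawn at random without replacement."""
    max_overlap = min(genes_in_category, de_genes)
    # Count the ways to draw i category genes and (de_genes - i) other genes.
    favorable = sum(
        comb(genes_in_category, i)
        * comb(total_genes - genes_in_category, de_genes - i)
        for i in range(de_in_category, max_overlap + 1)
    )
    return favorable / comb(total_genes, de_genes)


# Made-up numbers: 20000 genes, 100 in the category, 500 differentially
# expressed genes, 10 of which belong to the category.
example_p = enrichment_p_value(20000, 100, 500, 10)
```

With these made-up numbers, about 2.5 category members would be expected among the differentially expressed genes by chance, so observing 10 gives a small probability.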
This module contains code used to generate the gene regions file used by featureCounts in data processing. To generate gene regions, it reads and processes a BED file with ITAG2.4 gene annotations. The data file used is in the ExternalDataSets module.
This module was used to examine the distribution of intron sizes in tomato. This was done to determine the best maximum intron parameter for aligning reads with tophat.
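As a rough illustration of this kind of analysis, the Python sketch below computes intron sizes from BED12 records and summarizes them with a percentile. It is not the code in this module. It assumes the gene models are available as 12-column BED lines, and the helper names, the nearest-rank percentile, and the toy records are hypothetical.

```python
import math


def intron_sizes(bed12_line):
    """Return intron lengths for one BED12 record.

    Introns are the gaps between consecutive blocks (exons), computed
    from the blockSizes (column 11) and blockStarts (column 12) fields.
    """
    fields = bed12_line.rstrip("\n").split("\t")
    block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    sizes = []
    for i in range(len(block_sizes) - 1):
        intron_start = block_starts[i] + block_sizes[i]
        sizes.append(block_starts[i + 1] - intron_start)
    return sizes


def percentile(values, fraction):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


# Toy BED12 records with made-up coordinates and names.
toy_records = [
    "chr1\t1000\t2000\tmodelA\t0\t+\t1000\t2000\t0\t3\t100,200,150,\t0,400,850,",
    "chr1\t3000\t3500\tmodelB\t0\t-\t3000\t3500\t0\t1\t500,\t0,",
    "chr1\t9000\t14100\tmodelC\t0\t+\t9000\t14100\t0\t2\t50,50,\t0,5050,",
]
all_introns = [size for line in toy_records for size in intron_sizes(line)]
longest_intron = percentile(all_introns, 1.0)
```

A high percentile of the observed intron sizes is one possible starting point for a maximum intron setting, since it covers nearly all annotated introns while ignoring extreme outliers.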
- Ann Loraine firstname.lastname@example.org
- Ivory Clabaugh Blakley email@example.com
Copyright (c) University of North Carolina at Charlotte
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Also, see: http://opensource.org/licenses/MIT