Latest Updates and News from Genenetwork
2016-10-03: Notes on new BXD names.
- BXD73, BXD73a (originally known as BXD80), and BXD73b (originally known as BXD103) are genetically very similar. BXD73 and BXD80 are genetically identical at 82264 of 100290 markers (82% identical by descent). BXD73 keeps its original name and JAX identifier number (JR#7117), whereas BXD80 is now referred to as BXD73a (JR#7124). BXD73 and BXD103 are genetically identical at 90917 of 100290 markers (90.6% identical by descent). BXD103 is now referred to as BXD73b (JR#7146).
- BXD48 and BXD48a (originally known as BXD96) are sister substrains, and are genetically identical at 93485 of 100290 markers (93.2% identical by descent). BXD48 retains its original name and JAX identifier number (JAX JR#7097) whereas BXD96 is now referred to as BXD48a (JR#7139).
- BXD65, BXD65a (originally known as BXD97), and BXD65b (originally known as BXD92) are sister substrains. BXD65 and BXD97 are genetically identical at 92225 of 100290 markers (92% identical by descent). BXD97 is now referred to as BXD65a (JR#7140). BXD65 and BXD92 are genetically identical at 6155 of 6459 markers (95.3% identical by descent). BXD65 retains its original name and JAX identifier (JR#7110) whereas BXD92 is now referred to as BXD65b (JR#9677).
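The percentages quoted in these notes are ratios of matching marker calls to total markers. A minimal sketch of that arithmetic, assuming genotypes are held as equal-length sequences of calls such as "B" and "D"; the function name and the toy data are hypothetical and not part of GeneNetwork:

```python
def percent_identical(calls_a, calls_b):
    """Return (matches, total, percent) for two equal-length genotype call lists."""
    if len(calls_a) != len(calls_b):
        raise ValueError("genotype vectors must cover the same markers")
    if not calls_a:
        raise ValueError("no markers to compare")
    matches = sum(1 for a, b in zip(calls_a, calls_b) if a == b)
    total = len(calls_a)
    return matches, total, 100.0 * matches / total


# Toy example with made-up calls at ten markers (not real BXD genotypes).
strain_1 = list("BBDDBDBBDD")
strain_2 = list("BBDDBDBDDB")
matches, total, pct = percent_identical(strain_1, strain_2)
print(f"{matches} of {total} markers identical ({pct:.1f}%)")
```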
2016-10-02: BXD Genotypes file status (October 2016)
In September and October 2016, Robert Williams, Jesse Ingels, Lu Lu, and Danny Arends released new genotype files for many of the original BXD strains (BXD1 through BXD102), and for all of the new strains (BXD104 to BXD220). Genotypes were generated at about 74,000 SNPs using the GigaMUGA array developed by Drs. Fernando Pardo-Manuel de Villena (University of North Carolina) and Gary Churchill (The Jackson Laboratory). Genotypes were generated at GeneSeek (Neogen Inc) with financial support from the University of Tennessee Center for Integrative and Translational Genomics.
The new genotypes are now available in GeneNetwork as the 2017 Genotype file. All SNPs were mapped to the older July 2007 mm9 NCBI Build 37 assembly and to the newer Dec 2011, mm10, GRCm38 assembly. Of the 74,000 GigaMUGA and many other markers we have typed, only about 6800 are useful (informative) in defining recombination events in the current set of BXDs. These informative markers either define unique recombination patterns across the 198 BXD strains (including extinct strains) or they define the proximal and distal ends of regions that do not contain any known recombinations in the BXD family.
The file does not include any markers for Chr Y or the mitochondrion.
As of October 2016 GeneNetwork still uses mm9 coordinates for mapping functions.
(Updated Oct 2, 2016 by RW Williams)
2016-09-29: TIGEM Human Retina RNA-Seq (Sep16) RPKM log2 profiling by array entered into GeneNetwork.
This dataset is publicly available; please contact Michele Pinelli at the Telethon Institute of Genetics and Medicine (TIGEM) for further information. (Implemented by Michele Pinelli, Arthur Centeno and Rob Williams).
2016-07-26: Raw and processed data from our Hippocampus studies based on publication "Genetics of the hippocampal transcriptome in mouse: a systematic survey and online neurogenomics resource" can be accessed now through GEO series GSE84767.
2016-07-16: Gene expression data for 39 inbred mouse strains for CD4+ T cells profiling by array entered into GeneNetwork.
GEO Series: GSE60337. To determine the breadth and underpinning of changes in immunocyte gene expression due to genetic variation in mice, we performed, as part of the Immunological Genome Project, gene expression profiling for CD4(+) T cells and neutrophils purified from 39 inbred strains of the Mouse Phenome Database.
2016-05-16: FGUCAS BXH/HXB Brown Adipose Affy Rat Gene 2.0 ST (May16) log2 profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Michal Pravenec (Czech Academy of Sciences) for further information. (Implemented by Michal Pravenec, Arthur Centeno and Rob Williams).
2016-04-16: EPFL/ETHZ BXD Brown Adipose Proteome (Apr16) profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Johan Auwerx (Ecole Polytechnique Federale de Lausanne) for further information. (Implemented by Johan Auwerx, Arthur Centeno and Rob Williams).
2016-04-16: DoD TATRC Retina Affy MoGene 2.0 ST profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Eldon Geisert (Emory Eye Center) for further information. (Implemented by Eldon Geisert, Arthur Centeno and Rob Williams).
2016-04-16: VCU BXD NAc EtOH vs CIE Air M430 2.0 profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Michael Miles (Virginia Commonwealth University) for further information. (Implemented by Michael Miles, Arthur Centeno and Rob Williams).
2016-02-16: UCAMC LXS Whole Brain Saline and Ethanol RNA Sequence FPKM profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Richard Radcliffe (University of Colorado at Denver) for further information. (Implemented by Richard Radcliffe, Arthur Centeno and Rob Williams).
2016-02-16: UTHSC Hippocampus Illumina v6.1 NOS, NOE, RSS, and RSE data sets profiling by array entered into GeneNetwork.
This is untreated control "Base" group gene expression data for the hippocampus of BXD strains of mice (n = 27 strains and n = 35 animals). These NON data are useful as a baseline for comparison with the NOS, NOE, RSS, and RSE data sets. NON = No stress and no saline control injection; NOS = No restraint stress and given only saline injections prior to sacrifice; NOE = No restraint stress and given an ethanol injection prior to sacrifice; RSS = short restraint stress (1 episode) followed by a saline injection; and finally, RSE = Restraint stress followed by an ethanol injection.
For more details on the precise experimental paradigm, please see Ziebarth et al 2010 or the original paper that used this paradigm by Kerns RT, Ravindranathan A, Hassan S, Cage MP, York T, Williams RW, Miles MF (2005) Ethanol-responsive brain region expression networks: implications for behavioral responses to acute ethanol in DBA/2J versus C57BL/6J mice. Journal of Neuroscience 25: 2255-2266. Please contact Rob Williams (University of Tennessee Health Science Center) for further information. (Implemented by Lu Lu, Arthur Centeno and Rob Williams).
2016-02-16: EPFL/LISP BXD CD+HFD Subcutaneous WAT Affy MTA 1.0 GeneLevel Main (Feb16) RMA profiling by array entered into GeneNetwork.
The BXD genetic reference population is a recombinant inbred panel descended from crosses between the C57BL/6 (B6) and DBA/2 (D2) strains of mice, which segregate for about 5 million sequence variants. Recently, some of these variants have been shown to affect general metabolic phenotypes such as glucose response and bone strength. In this study, we examined genetic variants across 40 strains of BXD and the two founder lines, in addition to a major environmental influence, long-term feeding with chow diet (CD) or high fat diet (HFD), to see how metabolic gene expression varies by genotype, by environment, and through gene-by-environment interactions. The basic heart phenotypes quantified in these cohorts (e.g. blood pressure and heart rate) were not affected by HFD feeding.
Currently this dataset is confidential; please contact Johan Auwerx (Ecole Polytechnique Federale de Lausanne) for further information. (Implemented by Johan Auwerx, Arthur Centeno and Rob Williams).
2015-12-09: INIA LCM (11 Regions) RNA-seq Transcript Level (Dec15) profiling by array entered into GeneNetwork.
This dataset is now public; please contact Megan Mulligan (University of Tennessee Health Science Center) for further information. (Implemented by Megan Mulligan, Arthur Centeno and Rob Williams).
2015-12-07: UTHSC Affy MoGene 1.0 ST Spleen profiling by array entered into GeneNetwork.
This is a preliminary release with known errors of a spleen gene expression data set generated by a DOD-funded consortium (Byrne, Kotb, Williams, and Lu). Please contact Lu Lu or Robert Williams regarding status of this data set.
Animals were generated at UTHSC by Lu Lu and colleagues. The spleen of untreated young adult mice was profiled using the Affymetrix GeneChip Mouse Gene 1.0 ST array that contains approximately 34,728 probe sets that target approximately 29,000 well defined transcripts (RefSeq mRNA isoforms) and essentially all known protein coding genes in mouse. This array is an "exon style" array with multiple probes in all known exons of each gene (an average of about 27 per gene) and is an abridged version of the Affymetrix Exon 1.0 ST array. However, it also contains some probes that target non-coding RNAs and even miRNA precursors (search "ncrna"). This dataset is now public; please contact Robert Williams (University of Tennessee Health Science Center) for further information. (Implemented by Lu Lu, Arthur Centeno and Rob Williams).
2015-11-10: Hippocampus Mouse Transcriptome Assay 1.0 GeneLevel Main (Nov15) RMA profiling by array entered into GeneNetwork.
Please contact Megan Mulligan (University of Tennessee Health Science Center) for further information. (Implemented by Megan Mulligan, Arthur Centeno and Rob Williams).
2015-09-14: The Genotype-Tissue Expression (GTEx) project is providing a comprehensive atlas of gene expression and regulation across 53 human tissues. The latest GTEx data release (dbGaP release phs000424.v5.p1) is now available on GeneNetwork.
Note: Please disregard confusing legend provided in Study tab of the study page
which claims that study contains 552 subjects with genotypes - those totals
are counts of subjects with all molecular data types, not just molecular
More detailed representation of subject counts in molecular datasets
(including genotypes) may be found in 'Molecular Data' tab at Common Fund (CF) Genotype-Tissue Expression Project (GTEx) dbGaP Study Accession: phs000424.v5.p1.
2014-10-10: BXD Elicited Peritoneal Neutrophils Gene Expression profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Marko Radic (University of Tennessee Health Science Center) for further information. (Implemented by Marko Radic, Indira Neeli, Teruki Hagiwara, Arthur Centeno and Rob Williams).
2014-08-29: Mouse Diversity Panel Hippocampus Antidepressant profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Brooke Miller (University of Florida) for further information. (Implemented by Brooke Miller, Arthur Centeno and Rob Williams).
- UFL MDP Hippocampus Antidepressant Affy Mouse 430 2.0 (Aug14) RMA **
2014-08-10: RTI RCMRC BXD Fecal Metabolites CD+HFD (Aug14) Log2 profiling by array entered into GeneNetwork.
The goal of this project is to study expression of a large set of defined and undefined metabolites in fecal samples from a genetically diverse set of BXD mouse strains (females) raised on either a high fat diet or a low fat diet. Samples were taken directly from the large intestine of animals at sacrifice.
Aim: Effect of genetics and diet on fecal metabolomics
Use UPLC-MS Broad Spectrum Metabolomics to study: mouse fecal samples
To capture the most signal, all samples were analyzed in both positive and negative ion modes using a reversed-phase separation. Please contact Robert Clark (RTI International) for further information. (Implemented by Robert Clark, Arthur Centeno and Rob Williams).
2014-05-02: BXD Liver, Soluble Proteins CD and Hi-Fat Gene Expression profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Johan Auwerx (Ecole Polytechnique Federale de Lausanne) for further information. (Implemented by Johan Auwerx, Evan Williams, Arthur Centeno and Rob Williams).
- EPFL/ETHZ BXD Liver, Soluble Proteins CD (May14) SWATH **
- EPFL/ETHZ BXD Liver, Soluble Proteins HFD (May14) SWATH **
2014-04-03: UTHSC Mouse BXD Gastrointestinal Affy MoGene 1.0 ST Gene Level (Apr14) RMA profiling by array entered into GeneNetwork.
- UTHSC Mouse BXD Gastrointestinal Affy MoGene 1.0 ST Gene Level (Apr14) RMA **
Currently this dataset is confidential; please contact Dennis D Black (UT Le Bonheur Pediatric Specialists) for further information. (Implemented by Dennis D Black, Arthur Centeno and Rob Williams).
2014-03-07: The Genotype-Tissue Expression (GTEx) project is providing a comprehensive atlas of gene expression and regulation across multiple human tissues. The latest GTEx data release (dbGaP release phs000424.v4.p1) is now available on GeneNetwork.
The Genotype-Tissue Expression (GTEx) project is a collaborative effort that aims to identify correlations between genotype and tissue-specific gene expression levels that will help identify regions of the genome that influence whether and how much a gene is expressed. GTEx is funded through the Common Fund, and managed by the NIH Office of the Director in partnership with the National Human Genome Research Institute, National Institute of Mental Health, the National Cancer Institute, the National Center for Biotechnology Information at the National Library of Medicine, the National Heart, Lung and Blood Institute, the National Institute on Drug Abuse, and the National Institute of Neurological Disorders and Stroke, all part of NIH. This series of 837 samples represents multiple tissues collected from 102 GTEx donors and 1 control cell line. In total, 30 tissue sites are represented including Adipose, Artery, Heart, Lung, Whole Blood, Muscle, Skin, and 11 brain subregions. RNA-seq expression data, robust clinical data, pathological annotations, and genotypes are also available for these samples from dbGaP (http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000424.v2.p1) and the GTEx portal (www.broadinstitute.org/gtex). While GTEx is no longer generating Affymetrix expression data, donor enrollment continues and is expected to reach 1,000 by the end of 2015. Updates to the GTEx data in dbGaP and the GTEx Portal will be made periodically. Contributors: GTEx Laboratory, Data Analysis, and Coordinating Center (LDACC); The Broad Institute of MIT and Harvard (LDACC PIs: Kristin Ardlie and Gaddy Getz).
WU-Minn HCP Consortium Open Access Data Use Terms
2014-01-31: BXD Heart Polar Metabolites Gene Expression profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Johan Auwerx (Ecole Polytechnique Federale de Lausanne) for further information. (Implemented by Johan Auwerx, Evan Williams, Arthur Centeno and Rob Williams).
- EPFL/LISP BXD CD Heart Affy Mouse Gene 2.0 ST Gene Level (Jan14) RMA **
- EPFL/LISP BXD HFD Heart Affy Mouse Gene 2.0 ST Gene Level (Jan14) RMA **
- EPFL/LISP BXD CD Heart Affy Mouse Gene 2.0 ST Exon Level (Jan14) RMA **
- EPFL/LISP BXD HFD Heart Affy Mouse Gene 2.0 ST Exon Level (Jan14) RMA **
2014-01-30: GEO Series GSE15745 Abundant Quantitative Trait Loci for CpG Methylation and Expression Across Human Brain Tissues.
Currently this dataset is in progress; please contact J Raphael Gibbs (National Institute on Aging, NIH) for further information. Data originated from the GTEx (Genotype-Tissue Expression) eQTL Browser. (Implemented by J Raphael Gibbs, Arthur Centeno and Rob Williams).
- GSE15745 NIH Human Brain Cerebellum ILM humanRef-8 v2.0 (May10) RankInv
- GSE15745 NIH Human Brain Prefrontal Cortex ILM humanRef-8 v2.0 (May10) RankInv
- GSE15745 NIH Human Brain Temporal Cerebral ILM humanRef-8 v2.0 (May10) RankInv
- GSE15745 NIH Human Brain Pons ILM humanRef-8 v2.0 (May10) RankInv
2014-01-06: BXD Mouse Retina Blast Gene expression profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Eldon Geisert (Emory Eye Center) for further information. (Implemented by Eldon Geisert, Justin Templeton, Arthur Centeno and Rob Williams).
- DoD TATRC Retina Blast Affy MoGene 2.0 ST (Dec13) RMA Gene Level **
- DoD TATRC Retina Blast Affy MoGene 2.0 ST (Dec13) RMA Exon Level **
2013-12-16: BXD Liver Polar Metabolites and BXD Muscle Polar Metabolites Gene expression profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Johan Auwerx (Ecole Polytechnique Federale de Lausanne) for further information. (Implemented by Johan Auwerx, Evan Williams, Arthur Centeno and Rob Williams).
- EPFL/LISP BXD Liver Polar Metabolites CD+HFD (Dec13) **
- EPFL/LISP BXD Liver Polar Metabolites HFD (Dec13) **
- EPFL/LISP BXD Liver Polar Metabolites CD (Dec13) **
- EPFL/LISP BXD Muscle Polar Metabolites CD+HFD (Dec13) **
- EPFL/LISP BXD Muscle Polar Metabolites HFD (Dec13) **
- EPFL/LISP BXD Muscle Polar Metabolites CD (Dec13) **
2013-12-02: BXD Mouse Retina Gene expression profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Eldon Geisert (Hamilton Eye Institute, Department of Ophthalmology) for further information. (Implemented by Eldon Geisert, Justin Templeton, Arthur Centeno and Rob Williams).
- DoD TATRC Retina Affy MoGene 2.0 ST (Dec13) RMA Gene Level **
- DoD TATRC Retina Affy MoGene 2.0 ST (Dec13) RMA Exon Level **
2013-11-19: HZI PR8M-Infected Lungs Females RNAseq (Nov13) RPKM ** entered into GeneNetwork. Currently this dataset is confidential; please contact Klaus Schughart (Department Experimental Mouse Genetics) for further information. (Implemented by Klaus Schughart, Ashutosh Pandey, Arthur Centeno and Rob Williams).
2013-10-28: BXD CD Brown Adipose Gene expression profiling by array entered into GeneNetwork.
Currently this dataset is confidential; please contact Johan Auwerx (Ecole Polytechnique Federale de Lausanne) for further information. (Implemented by Johan Auwerx, Evan Williams, Arthur Centeno and Rob Williams).
- EPFL/LISP BXD CD Brown Adipose Affy Mouse Gene 2.0 ST Gene Level (Oct13) RMA **
- EPFL/LISP BXD CD Brown Adipose Affy Mouse Gene 2.0 ST Exon Level (Oct13) RMA **
2013-09-23: Gene expression changes in the course of normal brain aging are sexually dimorphic. Expression profiling by array (Platform: GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array) entered into GeneNetwork.
PMID: 18832152. GEO Series GSE11882. (Implemented by Nicole Claudia Berchtold, Arthur Centeno and Rob Williams).
- GSE11882 UCI Human Entorhinal Cortex Affy U133 Plus2 (Sep13) RMA
- GSE11882 UCI Human Hippocampus Affy U133 Plus2 (Sep13) RMA
- GSE11882 UCI Human Postcentral Gyrus Affy U133 Plus2 (Sep13) RMA
- GSE11882 UCI Human Superior Frontal Gyrus Affy U133 Plus2 (Sep13) RMA
GeneNetwork 10-year anniversary. We now have a decade of expression data in GeneNetwork (first mRNA data is a BXD Whole Brain U74Av2 Aug03). Here are some current statistics:
- 437 expression data sets for all species. Many of these are "variants on a theme" such as Male, Female, Combined, or PDNN, RMA, MAS5 transformations.
- 240 expression data sets for the BXD family of mice. The redundancy rate is probably about 3x (three versions of every primary data set).
- 13.7 million mRNA traits. Again, figure about 3x redundancy. That is ~4 million molecular assays across the family.
- 685 million mRNA strain-phenotype values for the BXD family. That means that we have 50 strains on average per trait (685/13.7). For most of these we also have standard error terms, so we have over a billion data points for the BXD family.
SEARCH FOR GENES BY AUTHOR. You can now search gene expression data sets using an author query. The search is not perfect yet, but try these queries in the search boxes:
- NAME=(Watson JD)
- NAME=(Snyder) will work but NAME=(Snyder M) does not yet work
- NAME=(Synder) is a typo, and will not work
- NAME=(Williams LU) is interpreted as two different authors (RW Williams and L Lu in this case).
We plan to refine this type of query and to add a query for Institutions (INST=Yale). (Implemented by Lei Yan).
2013-08-02: Experimental INIA Hypothalamus Affy MoGene 1.0 ST (Nov10) PCA (principal component analysis) v080813 entered into GeneNetwork. Variance adjusted and denoised by Ashutosh Pandey (Implemented by Khyobeni Mozhui, Adrienne Adler, Jesse Ingels, Lu Lu, Ashutosh Pandey, Arthur Centeno and Rob Williams).
Several features have been implemented or are currently being implemented for GeneNetwork 2:
- A new interface for user registration is being developed for GeneNetwork 2, as well as an interface with which users can designate groups of data sets and assign read/write privileges. (Implemented by Sam Ockman and Zach Sloan)
- The correlation page has been mostly re-implemented. While the output is the same, all features re-implemented in GN2 have been almost entirely rewritten using much clearer business logic, a web framework (Flask), and a templating language (Jinja2). (Implemented by Lei Yan and Zach Sloan)
- The marker regression function has been re-implemented using the PyLMM code written by Nick Furlotte. This code is much more robust than the code currently used in the marker regression function of GN1 and accounts for kinship between samples/strains. (Implemented by Zach Sloan and Sam Ockman, with code written by Nick Furlotte)
Future updates will continue to be posted in the news as they're completed.
We added a link from GeneNetwork to BNW (Bayesian Network Web Server) at GeneNetwork collection page.
(Implemented by Lei Yan).
2013-06-13: Whole-genome gene expression profiles of non-tumorous human lung tissues GEO Series GSE23546 entered into GeneNetwork. This SuperSeries is composed of the SubSeries: GSE23352 (Laval set: 499 samples), GSE23529 (UBC set: 405 samples), GSE23545 (GRNG set: 445 samples), please contact Yohan Bossé or Ke Hao for further information. Platform used: GPL10379 Rosetta/Merck Human RSTA Custom Affymetrix 2.0 microarray. (Implemented by Yohan Bossé, Ke Hao and Arthur Centeno).
When editing gene aliases, the aliases can be separated by a comma, semicolon, vertical bar, space, or line break. All separators are then converted to semicolons (NCBI style).
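The alias handling described here can be sketched with a single regular expression. This is an illustrative sketch only, not the GeneNetwork implementation: the function name is hypothetical, and whether the production code trims, deduplicates, or pads the semicolons with spaces is not stated in this note.

```python
import re

# Separators named in the note: comma, semicolon, vertical bar, space, line break.
_SEPARATORS = re.compile(r"[,;|\s]+")


def normalize_aliases(raw):
    """Split a free-form alias string and rejoin the pieces with semicolons."""
    parts = [p for p in _SEPARATORS.split(raw) if p]  # drop empty pieces
    return ";".join(parts)


print(normalize_aliases("Trp53, p53|Tp53\nbbl  bfy"))
```

Treating a run of mixed separators as one delimiter avoids empty aliases when a user types, for example, a comma followed by a space.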
(Implemented by Lei Yan).
2013-04-13: DoD TATRC Retina Affy MoGene 2.0 ST Array (April 2013) RMA entered in GeneNetwork. This data set consists of 19 BXD strains and the DBA/2J parental strain. The data are now closed and not available for analysis.
This is Robust Multi-array Average (RMA) expression data that has been normalized using what we call a 2z+8 scale, without special correction for batch effects. The data for each strain was computed as the mean of four samples per strain. Expression values on a log2 scale range from 6.252 to 18.07 (11.83 units), a range of approximately 3600-fold. After taking the log2 of the original non-logged expression estimates, we convert data within an array to a z score. We then multiply the z score by 2. Finally, we add 8 units to ensure that no values are negative. The result is a scale with a mean of 8 units and a standard deviation of 2 units. A two-fold difference in expression is equivalent roughly to 1 unit on this scale. (Implemented by Eldon E. Geisert and Arthur Centeno).
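The 2z+8 steps described above (log2, z score within an array, multiply by 2, add 8) can be sketched as follows. This is a minimal illustration with made-up intensities; the function name is hypothetical, and the use of the sample standard deviation rather than the population standard deviation is an assumption, since the note does not say which was used.

```python
import math
import statistics


def two_z_plus_eight(raw_values):
    """Rescale one array's raw expression values to the 2z+8 scale."""
    logged = [math.log2(v) for v in raw_values]  # log2 of the non-logged estimates
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample SD; an assumption, see lead-in
    # z score within the array, multiplied by 2, then shifted up by 8 units
    return [2.0 * (x - mean) / sd + 8.0 for x in logged]


raw = [120.0, 450.0, 900.0, 3100.0, 15000.0]  # made-up intensities for one array
scaled = two_z_plus_eight(raw)
print([round(x, 2) for x in scaled])
print(round(statistics.fmean(scaled), 6), round(statistics.stdev(scaled), 6))
```

By construction the rescaled values have a mean of 8 and a standard deviation of 2, which is what makes arrays comparable on this scale.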
2013-04-06: PLINK procedure upgraded:
The PLINK calculation procedure was upgraded, so the mapping time with PLINK is now reduced to less than half. For Human CANDLE data, mapping previously took more than 400 seconds; it now takes no more than 200 seconds. (Implemented by Lei Yan).
2013-03-21: New GeneNetwork UCSC Genome Browser mirror:
We have installed a second GeneNetwork UCSC Genome Browser mirror site that supports mm10 and mm9. We also loaded DBA/2J Sequence and Structural Variants and C57BL/6NJ Sequence and Structural Variants for mm10. (Implemented by Lei Yan and Ashutosh Pandey).
2013-01-18: The Scripps Research Institute (TSRI) Dorsal Root Ganglia (DRG) Affy Mouse Genome 430 2.0 (Jan13) RMA Mouse Diversity Panel (MDP) entered into GeneNetwork. Currently this dataset is confidential; please contact Andrew Su for further information. (Implemented by Andrew Su and Arthur Centeno).
2013-01-08: Users can now choose the number of permutations that are used to estimate empirical P values for mapping results. We typically default to 2000 permutations, but if you need more precise estimates of P values it is possible to request up to 100,000 permutations. On a good day and without many other active users on GeneNetwork, the system can handle 1000 permutations per second. A histogram of the highest LRS scores for each of the many permutations is now displayed again in the Marker Regression output window. (Implemented by Zach Sloan and Lei Yan).
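The idea behind a permutation-based empirical P value can be sketched in a few lines. This is not GeneNetwork's mapping code: the peak statistic below (largest absolute difference in group means across markers) is a simple stand-in for the peak LRS, and all names and data are hypothetical.

```python
import random


def max_statistic(trait, genotypes_by_marker):
    """Largest absolute difference in group means across all markers.

    A simple stand-in for the peak LRS of a genome scan.
    """
    best = 0.0
    for calls in genotypes_by_marker:
        b = [t for t, g in zip(trait, calls) if g == "B"]
        d = [t for t, g in zip(trait, calls) if g == "D"]
        if b and d:
            best = max(best, abs(sum(b) / len(b) - sum(d) / len(d)))
    return best


def empirical_p(trait, genotypes_by_marker, n_perm=2000, seed=0):
    """Fraction of permutations whose peak statistic reaches the observed peak."""
    rng = random.Random(seed)
    observed = max_statistic(trait, genotypes_by_marker)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between trait values and genotypes
        if max_statistic(shuffled, genotypes_by_marker) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up data: 8 strains, 3 markers.
trait = [10.0, 9.75, 10.25, 10.0, 12.25, 12.0, 11.75, 12.5]
markers = ["BBBBDDDD", "BDBDBDBD", "BBDDBBDD"]
p = empirical_p(trait, markers, n_perm=2000)
print(f"empirical P = {p:.4f}")
```

The smallest P value this procedure can report is 1/(n_perm + 1), which is why more permutations are needed when more precise estimates of small P values are required.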
2012-11-08: First mappable RNA-seq data set in GeneNetwork for 29 genotypes of mice (BXD family) for whole brain RNA samples. We used a RiboMinus rather than polyA+ protocol. Approximately 15 million unique and aligned reads per sample were generated and used to estimate mRNA abundance (RPKM, normalized reads per 1 kb of gene model per million tags). This data set summarizes values on a per GENE basis. We will be adding EXON level analysis in the next few weeks. To the best of our knowledge this is the first accessible and directly mappable RNA-seq data set available on the Internet. (Implemented by Khyobeni Mozhui, Arthur Centeno, David Li, Lu Lu, William Taylor, and Xusheng Wang).
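For readers unfamiliar with the unit, RPKM is the read count for a gene divided by the gene model length in kilobases and by the library size in millions of reads. A minimal sketch with made-up numbers; the function name is hypothetical and this is not the pipeline used to generate the data set:

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    # (reads / (length_bp / 1e3)) / (total_reads / 1e6) simplifies to the line below
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Made-up example: 600 reads on a 2 kb gene model in a 15-million-read library.
value = rpkm(600, 2000, 15_000_000)
print(f"RPKM = {value:.1f}")
```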
2012-10-01: The core GeneNetwork code is now being rewritten in Python 2.7 using the Flask framework and Jinja2 HTML templating. Progress can be monitored on the GN GitHub site. Our goal is to produce a much more extensible and modular version of GeneNetwork (GN2.0). GN2.0 will initially not have nearly as many features as GN1.0 but it will be used as the base for all new coding. We expect to port GN2.0 to Python 3.3+ late in 2013 or early 2014. (Implemented by Lei Yan, Zach Sloan, Sam Ockman, with advice from Christian Fernandez).
2012-09-01: BXD Hypothalamic-Pituitary-Adrenal axis data sets have been completed and added to GeneNetwork. These data sets are accessible to all users without registration at two different levels of analysis--whole gene estimates of transcript expression, and single exon estimates of expression. The three primary data sets have now been error checked, and corrected. The hypothalamus data have been split by sex. PMID: 22593731. GEO Series GSE36674. (Implemented by Khyobeni Mozhui, Adrienne Adler, Jesse Ingels, Arthur Centeno, Lu Lu, and Rob Williams).
2012-07-20: BXD Pituitary gene expression data added to GeneNetwork. This is part of a systematic genetic analysis of the hypothalamic-pituitary-adrenal axis in the mouse (~5 strains). Data for hypothalamus and pituitary are complete and error checked. Data for the adrenals are still being error-checked. (Implemented by Khyobeni Mozhui, Adrienne Adler, Jesse Ingels, Arthur Centeno, Lu Lu, and Rob Williams).
2012-07-01: New beta SEARCH feature by NAME (testing stage) now allows searching for genes/transcripts by the name of authors and scientists who have worked on a set of genes. The search "name=rakic" or "name=(williams rw)" will find genes that have been linked to articles by "Rakic" or "RW Williams" in PubMed. We are still working on the interface for this new search type. Send any suggestions for improvements or problems that you encounter. (Implemented by Lei Yan and Ashutosh Pandey).
2012-06-04: GeneNetwork development and source control now use GitHub. We have switched from the Apache Subversion (SVN) software versioning system to GitHub for developing GeneNetwork. (Implemented by Lei Yan in collaboration with Christian Fernandez and Sam Ockman).
2012-05-20: All GeneNetwork code is available on GitHub. (Implemented by Lei Yan in collaboration with Christian Fernandez and Sam Ockman).
2012-05-04: Second Developmental Studies of the Genetics of Gene Expression in Brain have been entered into GeneNetwork (BIDMC/UTHSC Dev Neocortex P3 and P14 ILMv6.2 (Nov11) RankInv). Dr. Glenn Rosen and colleagues have contributed data on gene expression across sets of 32 BXD strains for the neocortex and striatum at two stages of development (postnatal days 3 and 14). They used the Illumina Mouse Genome 6 version 2 array. These data are matched by previous data sets for the adult neocortex and striatum. These data are now publicly available but users are requested to contact Glenn D. Rosen regarding the status of these new data. (Implemented by G Rosen, RW Williams and A Centeno).
2012-01-20: Mouse SNPs from dbSNP have been added to GeneNetwork. 10 million mouse SNPs from dbSNP (build 128) have been added to the Variant Browser. They can be searched by name (e.g. rs31192936). (Implemented by Xiaodong Zhou and Ning Liu).
2012-01-20: Literature correlation has been updated to the 2011 version. Dr. Ramin Homayouni and Dr. Lijing Xu kindly provided the 2011 version of the mouse gene-gene literature correlation matrix to GeneNetwork. (Implemented by Xiaodong Zhou).
2011-12-16: Expression data set for EPFL/LISP BXD Muscle Affy Mouse Gene 1.0 ST (Dec11) RMA ** has been entered in GeneNetwork. Laboratory of Integrative and Systems Physiology (LISP). This data set is not yet freely available for global analysis. This data set has not yet been used or described in any publication. Please contact Johan Auwerx or Evan Williams at firstname.lastname@example.org regarding use of these data. (Implemented by J Auwerx, E Williams, LA Rose, RW Williams and A Centeno).
2011-9-29: We have added liver gene expression data for many strains of mice from GEO series GSE16780. These data were generated by Dr. Jake Lusis and colleagues at UCLA and are currently listed as a BXD data set, although the study actually includes many other strains (see "GSE16780 UCLA Hybrid MDP Liver Affy HT M430A (Sep11) RMA"). Since adding the data we have discovered errors in strain assignment that affect a majority of the conventional inbred strains and several RI strains. For this reason, these data should still be considered provisional. For complete information please refer to A high-resolution association mapping panel for the dissection of complex traits in mice. Genome Res 2010 Feb;20(2):281-90. PMID: 20054062 (Implemented by Bennett BJ, Ghazalpour A. Data entered on 9-29-11 by A. Centeno under Accession number: GN373).
2011-9-17: The QTLminer feature was deployed to the GeneNetwork production server. (Implemented by Rudi Alberts, Lei Yan, Ning Liu, Xiaodong Zhou)
2011-7-20: GeneNetwork was moved to Amazon Cloud (EC2). (Implemented by Lei Yan)
2011-7-12: A SourceForge site for GeneNetwork was built: https://sourceforge.net/projects/genenetwork/. (Implemented by Lei Yan, Robert Williams)
2011-7-11: A new account on our SVN server was created for sharing GeneNetwork. Anybody can get the latest version of the GN source code by checking it out (this account cannot commit).
(Implemented by Lei Yan)
2011-7-5: Harvard Brain Tissue Resource Center/ Merck Research Laboratories. This study aims at identifying functional variation in the human genome (especially as it pertains to brain expressed RNAs) and elucidating its relationship to disease and drug response. The ~800 individuals in this dataset comprise approximately 400 Alzheimer's disease (AD) cases, 230 Huntington's disease (HD) cases, and 170 controls matched for age, gender, and post mortem interval (PMI). The tissue specimens for this study were provided by Harvard Brain Tissue Resource Center (HBTRC).
Three brain regions (cerebellum, visual cortex, and dorsolateral prefrontal cortex) from the same individuals were profiled on a custom-made Agilent 44K microarray of 39,280 DNA probes uniquely targeting 37,585 known and predicted genes, including splice variants, miRNAs and high-confidence non-coding RNA sequences. The individuals were genotyped on two different platforms, the Illumina HumanHap650Y array and a custom Perlegen 300K array (a focused panel for detection of singleton SNPs). Clinical outcomes available include age at onset, age at death, Braak scores (AD), Vonsattel scores (HD), Regional brain enlargement/atrophy.
Acknowledgements. The Harvard Brain dataset was contributed by Merck Pharmaceutical through the Sage Bionetworks Repository. The tissues were provided by Harvard Brain Tissue Resource Center which is supported in part by PHS grant R24 MH068855 (http://www.brainbank.mclean.org/). Investigators: Francine Benes/ Eric Schadt. (Implemented by Megan Mulligan, Rob Williams and Arthur Centeno).
2011-6-16: CANDLE Study expression data entered in GN. The primary goal of the CANDLE study is to study factors that affect brain development in young children. To this end, the current study will test specific hypotheses regarding factors that may negatively influence cognitive development in children.
For information on genomic and genetic studies related to CANDLE, please contact: Drs. Ronald M. Adkins (radkins1 at uthsc.edu) and Julia Krushkal (jkrushka at uthsc.edu). This data set is currently confidential. (Implemented by Khyobeni Mozhui, Rob Williams and Arthur Centeno).
2011-6-6: Confidential Phenotype Trait Feature: We added support for confidential phenotype traits to GeneNetwork. (Implemented by Xiaodong Zhou).
2011-6-6: SNP INDEL Variant Browser updated: We have greatly expanded and improved the SNP INDEL variant browser that is built into GeneNetwork. This resource enables users to rapidly review both known and confidently imputed sequence variants in the mouse genome. The data set includes over 65 million SNPs that are largely taken from sequencing efforts of David Adams and colleagues at the Sanger Institute and our own team at UT. The imputation of SNPs to other strains was carried out by Eleazar Eskin, Nick Furlotte, and colleagues at UCLA. (Implemented by Ning Liu and Xiaodong Zhou).
2011-6-6: Major Overhaul of Trait Data pages: The Trait Data and Analysis page has been redesigned to reduce complexity and visual clutter. Functions have not been changed, but some may have been moved. (Implemented by Zachary Sloan, Xiaodong Zhou, and Rob Williams).
2011-6-6: Upgraded GeneNetwork Hardware: We are converting GeneNetwork to MySQL master-slave replication with faster solid-state hard drives to improve performance. (Implemented by Lei Yan).
2011-3-21: Sample blocking: The user can now block individual samples/strains in the Trait Data and Analysis page by typing either an individual index number or a range (ex: 1,2,3,10-20). This feature was created to eliminate the need for a user to manually replace each sample's value with 'x'. (Implemented by Zachary Sloan).
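The blocking syntax described above (single index numbers and ranges, as in 1,2,3,10-20) can be sketched in a few lines of Python. This is only an illustration of how such a specification might be expanded; the function names `parse_block_spec` and `block_samples` are hypothetical and are not taken from the GeneNetwork source code.

```python
def parse_block_spec(spec):
    """Expand a spec such as '1,2,3,10-20' into a sorted list of indices.

    Hypothetical helper: the syntax follows the news entry, the
    implementation is an assumption.
    """
    blocked = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                start, end = end, start  # accept reversed ranges
            blocked.update(range(start, end + 1))
        else:
            blocked.add(int(part))
    return sorted(blocked)


def block_samples(values, spec):
    """Return a copy of values with blocked samples replaced by 'x'.

    Index numbers are assumed to be 1-based, as shown to the user.
    """
    blocked = set(parse_block_spec(spec))
    return ["x" if i in blocked else v
            for i, v in enumerate(values, start=1)]
```

For example, `block_samples([7.1, 7.4, 6.9, 8.0], "2,4")` replaces the second and fourth values with 'x', which is the manual edit this feature was designed to avoid.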
2011-3-18: User login status: The user login status is now shown on all dynamically generated pages by using a session mechanism throughout the entire GN system. (Implemented by Xiaodong Zhou).
2011-2-23: More space for your Trait Collections: We have greatly expanded the number of traits, transcripts, genes, and markers that can be added to your collections. The current limit is now 3,000, up from 100 in the previous version. This improvement was achieved by storing collection information using a different and more secure method (session control rather than cookies). (Implemented by Xiaodong Zhou).
2011-2-9: INIA Amygdala BLA Affy MoGene 1.0 ST (Nov10):
This is a final error-checked release of an amygdala gene expression data set generated by Khyobeni Mozhui, Lu Lu, and Robert W. Williams and colleagues with funding support from the NIH NIAAA. The basolateral complex of the amygdala of untreated young adult mice was profiled using the Affymetrix GeneChip Mouse Gene 1.0 ST array that contains approximately 34,728 probe sets that target approximately 29,000 well defined transcripts (RefSeq mRNA isoforms) and essentially all known protein coding genes in mouse. This array is an "exon style" array with multiple probes in all known exons of each gene (an average of about 27 per gene) and is an abridged version of the Affymetrix Exon 1.0 ST array. However, it also does contain some probes that target non-coding RNAs and even miRNA precursors (search "ncrna").
This data set is not yet freely available for global analysis. This data set has not yet been used or described in any publication. Please contact Robert W. Williams at email@example.com regarding use of these data.
2011-2-9: INIA Hypothalamus Affy MoGene 1.0 ST (Nov10):
These hypothalamic gene expression data were generated by Khyobeni Mozhui, Lu Lu, and Robert W. Williams and colleagues with funding support from NIAAA. The data set includes samples from 50 strains, including 46 BXDs, both parental strains, and reciprocal F1 hybrids. Expression data were generated using the Affymetrix Mouse Gene 1.0 ST exon-style microarray (multiple probes in all known exons) by Lorne Rose in the UTHSC Molecular Resources Center (MRC), Memphis TN. The table below provides a summary of cases, sex, and age. Hypothalamic tissue was dissected by K. Mozhui (description to follow) with special attention to time of day (every sample has time stamp). RNA was extracted by K. Mozhui. All other processing steps by the UTHSC MRC by L. Rose. Data were processed by Arthur Centeno.
This data set is not yet freely available for global analysis. This data set has not yet been used or described in any publication. Please contact Robert W. Williams at firstname.lastname@example.org regarding use of these data.
2011-2-7: GenEx BXD EtOH Liver Affy M430 2.0 (Jan11) RMA **: Data set generated with support of NIAAA by Drs. Robert Rooney (Genome Explorations Inc.), Divyen Patel (Genome Explorations Inc.), and Kristin Hamre (UTHSC). All animals were on standard chow and water ad lib. Both the saline control group and the ethanol-treated group were given solutions via intragastric gavage, with controls getting saline and the ethanol-treated group getting 6g/kg of ethanol. Ethanol-treated mice were generally comatose for 4-6 hrs but were responsive and moving by the next morning. Tissue was collected at 24 hours after the initial infusion.
This data set is not yet freely available for global analysis. This data set has not yet been used or described in any publication. Please contact Robert W. Williams at email@example.com regarding use of these data.
2011-2-3: OHSU HS-CC Striatum ILM6v1 (Feb11) RankInv: The current study focused on the extent to which genetic diversity within a species (Mus musculus) affects gene co-expression network structure. To examine this issue, we have created a new mouse resource, a heterogeneous stock (HS) formed from the same eight inbred strains that have been used to create the collaborative cross (CC). The eight inbred strains capture > 90% of the genetic diversity available within the species. For contrast with the HS-CC, a C57BL/6J (B6) × DBA/2J (D2) F2 intercross and the HS4, derived from crossing the B6, D2, BALB/cJ and LP/J strains, were used. Brain (striatum) gene expression data were obtained using the Illumina Mouse WG 6.1 array, and the data sets were interrogated using a weighted gene co-expression network analysis (WGCNA).
Read full article: Genetic diversity and striatal gene networks: focus on the heterogeneous stock-collaborative cross (HS-CC) mouse.
2011-1-26: Auwerx Lab BXD Phenotype Data: We have entered the first large-scale metabolic, cardiovascular, and clinical chemistry data sets (n = 143 phenotypes) for the BXD strains of mice. Data are averages for males and females separately, for as many as 43 strains. Data were generated using animals born at UTHSC (Memphis) and phenotyped in Strasbourg, France in 2008 using EMPReSS Slim EUMORPHIA standard operating protocols. Blood pressure phenotypes were included in a PLoS Genetics paper in 2009, but we now provide the complete phenotypes from this cohort of animals. (Phenotyping by Hana Koutnikova, Johan Auwerx and colleagues; data processing by H Koutnikova, RW Williams, EG Williams, and Xiaodong Zhou).
2011-1-26: Expression data for the prefrontal cortex of the BXD strains have been added to GN. Mice from each genotype received 4 weekly cycles of chronic intermittent ethanol (CIE) vapor exposure (EtOH group) or air exposure (CTL group) in inhalation chambers. The general study design was generated by Dr. Michael Miles and colleagues. All data sets are currently being tested. Contact Dr. Miles at VCU Medical Center for access. (Implemented by M Miles, RW Williams and A Centeno).
- VCU BXD PFC CIE Air M430 2.0 (Jan11) RMA **
- VCU BXD PFC CIE EtOH M430 2.0 (Jan11) RMA **
- VCU BXD PFC EtOH vs CIE Air M430 2.0 (Jan11) Sscore **
2011-1-14: An NCSU expression data set for Drosophila melanogaster has been added to GeneNetwork. For more information about The Drosophila Genetic Reference Panel click here. (Julien F. Ayroles, Trudy F. C. Mackay).
- NCSU Drosophila Whole Body (Jan11) RMA
2011-1-11: Genome Explorations Expression data sets for Liver have been added to GeneNetwork. (Implemented by B Rooney, K Hamre, RW Williams and A Centeno).
- GenEx BXD Sal Liver Affy M430 2.0 (Jan11) RMA Females **
- GenEx BXD Sal Liver Affy M430 2.0 (Jan11) RMA Males **
- GenEx BXD Sal Liver Affy M430 2.0 (Jan11) RMA Both Sexes **
- GenEx BXD EtOH Liver Affy M430 2.0 (Jan11) RMA Females **
- GenEx BXD EtOH Liver Affy M430 2.0 (Jan11) RMA Males **
- GenEx BXD EtOH Liver Affy M430 2.0 (Jan11) RMA Both Sexes **
2010-12-22: Phenotype traits can be searched by LRS. For instance, search by LRS=(23 46) or LRS=(9 99 Chr4 122 155). (Implemented by Xiaodong Zhou).
2010-12-20: The final quality controlled release of a spleen gene expression data set generated by a DOD-funded consortium (Byrne, Kotb, Williams, and Lu) has been entered in GeneNetwork.
- UTHSC Affy MoGene 1.0 ST Spleen (Dec10) RMA
We built a new cluster system with four nodes. Each node is a Dell PowerEdge R815 with 48 cores and 64 GB of RAM (1.3 GB per core). The CPUs are AMD processors running at 1.86 GHz. The headmaster connects to one Dell PowerVault MD1000 (15 × 2 TB hard drives, RAID 5). The Galaxy system was also installed on the cluster. (Implemented by Lei Yan).
2010-10-29: First Developmental Studies of the Genetics of Gene Expression in Brain have been entered into GeneNetwork. Dr. Glenn Rosen and colleagues have contributed data on gene expression across sets of 32 BXD strains for the neocortex and striatum at two stages of development (postnatal days 3 and 14). They used the Illumina Mouse Genome 6 version 2 array. These data are matched by previous data sets for the adult neocortex and striatum. These data are open but users are requested to contact Glenn D. Rosen regarding the status of these new data. (Implemented by G Rosen, RW Williams and A Centeno).
2010-10-29: Large RNA-seq data for BXD Whole Brain is being added to our GeneNetwork version of the UCSC Genome Browser. We are still loading these data and eventually will display RNA-seq data for over 30 strains of mice. Gene level summary of these data will be entered into GeneNetwork for quantitative analysis later this year. Please contact RW Williams regarding the status of these new data. (Implemented by David Li, Lu Lu, Xusheng Wang, Lei Yan, and RW Williams).
2010-9-23: Genotype data for mouse strains BXD101, BXD102, and BXD103 have been added to GeneNetwork. These data were extracted from a large scale regenotyping of the BXD strains done using the Mouse Diversity array from Affymetrix that was designed by Pardo and Churchill. (Implemented by Ning Liu and Xiaodong Zhou).
2010-9-20: Five groups of gene expression data for the hippocampus of BXD strains of mice have been entered in GeneNetwork. NON = No stress and no saline control injection; NOS = No restraint stress and given only saline injections prior to sacrifice; NOE = No restraint stress and given an ethanol injection prior to sacrifice; RSS = short restraint stress (1 episode) followed by a saline injection; and finally, RSE = Restraint stress followed by an ethanol injection. (Implemented by Lu Lu, RW Williams and A Centeno).
2010-9-13: There is now a good Wikipedia entry for GeneNetwork. Please check, correct, and improve. (Implemented by RW Williams).
2010-9-13: GeneNetwork source code is now licensed under the Affero General Public License version 3. A SourceForge site will be set up in the next few months. In the interim, please contact us directly for code. (Implemented by Xiaodong Zhou and RW Williams).
2010-08-02: We have installed a GeneNetwork UCSC Genome Browser mirror site that displays a set of 4.58 million SNPs that distinguish strain DBA/2J from the reference strain, C57BL/6J. Other DNA sequence and RNA-seq data sets will be added over the next several months. (Implemented by Lei Yan and Xusheng Wang).
2010-08-02: Tissue Correlation and Expression Level Services: We have added a new web interface that allows you to directly evaluate differences in gene expression across 30 different tissue types.
The current version is mainly meant for testing. The interface still needs work. We have a long list of improvements in the works, but please send us your ideas. To test drive the tissue correlation feature you currently need to enter a set of mouse GeneID numbers. For example, App is NCBI Entrez Gene ID 11820 and Bace is Gene ID 23821. If you enter these two numbers in the interface and then click your heels twice, you should get back a simple matrix of values that lists both Pearson and Spearman correlations based on a comparison of expression in 25 tissue types. Click on the correlation values and this will pop up two scatterplots (Pearson and Spearman types).
At the same time, this tool provides expression estimates for both genes, where a value of 8 in the Pearson plot represents the mean across all tissues and each unit represents a two-fold difference (log2 expression; Spearman rank values are just that--rank out of 25). To access this new feature select Search -> Tissue Correlation. All of the expression data are taken from C57BL/6J litter mates studied using the Illumina Mouse 6 2.0 array. (Implemented by Ning Liu, data from Lu Lu, RW Williams, and Xusheng Wang).
2010-07-02: Microarray annotations have been improved and will soon be more consistent across platforms. We now synchronize the gene level annotation of probes and probe sets. When gene level attributes such as gene symbols, alias, name, and other identifiers, are changed, the change is applied to all other probes and probe sets with the same Gene Id. (Implemented by Xiaodong Zhou).
2010-07-02: We have added Homologene identifiers to help in comparative analysis. (Implemented by Xiaodong Zhou)
2010-07-02: Improved icons and GUI for selecting and analyzing GeneNetwork data sets. The use of icons enables fast recognition of functions and is also more compatible with touch screen interfaces. There are three types of icons:
- Selection tools with a grey background
- GeneNetwork analysis tools with a blue background
- External resources analysis tools with a clear background
This GUI was implemented by Zach Sloan.
2010-07-02: We have improved the interface for partial correlations. Both the zero order and higher order correlations now use identical sample sizes for more direct comparisons. We have also added several checking procedures to help you avoid undesirable results. (Implemented by Xiaodong Zhou).
2010-06-29: A new GeneNetwork Wiki (MediaWiki) was built, and the old GeneNetwork Wiki (TWiki) was frozen for security issues. (Implemented by Lei Yan).
2010-06-29: PDF and tour of GeneNetwork presented at the Oak Ridge National Laboratory that is useful as a primer and good source of images for your own work or presentations. (Implemented by R Williams).
2010-06-18: Mouse brain and eye RNA-seq data displays on the UCSC Genome Browser have been upgraded by Xusheng Wang for C57BL/6J and DBA/2J strains of mice.
Sequence tag counts are plotted on a Log2 Y axis over a range from 2 to 10 (n = 2^2 to 2^10). (Implemented by Xusheng Wang, David Li, and colleagues).
- brain and eye combined
2010-05-18: Oxford Heterogeneous Stock Expression Data Sets for Hippocampus, Lung, and Liver have been added to GeneNetwork. These are large (n > 250), powerful, and well structured data sets from a single cohort of highly recombinant outcrossed mice that have a genetic architecture similar to that of human populations. All tissue was taken from non-inbred mice generated by systematically outcrossing a stock of eight inbred strains by Robert Hitzemann and colleagues (A/J, AKR/J, BALB/cJ, CBA/J, C3H/HeJ, C57BL/6J, DBA/2J, and LP/J). Please refer to the publication by Huang and colleagues (2008).
The Oxford group (Jonathan Flint, Richard Mott, and colleagues) maintains a complete data repository and key summary data that can be accessed at
All correlation and network graphing functions are currently implemented in GeneNetwork. You can generate correlation matrices, graphs, and compute partial correlations. However, QTL mapping has not yet been implemented for these three data sets. Traits have been mapped by the Oxford group and QTLs can be viewed using GSCANDB. We are currently working with Dr. Richard Mott and colleagues to include a real-time R Happy package in GeneNetwork that will enable users to map the original trait data and derivatives. (Implemented by Arthur Centeno, Rob Williams, and Jonathan Flint).
2010-05-18: Partial Correlations can now be calculated in GeneNetwork. We have integrated R Project code to compute up to third order partial correlations. This makes it possible to compute the correlation between X and Y traits while controlling for the possible confounding effects of three other variables, for example the age of cases, experimental groups, or even a genetic marker. To use the Partial Correlation feature, please add X, Y, and the possible confounder traits or markers into a Collection. Then just look for the Partial Correlation icon and text. (Implemented by Xiaodong Zhou).
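To make the idea of first to third order partial correlations concrete, here is a minimal Python sketch based on the standard recursive formula, in which each additional control variable is removed one at a time. GeneNetwork itself uses R Project code for this; the functions below (`pearson`, `partial_corr`) are illustrative stand-ins and not the production implementation.

```python
import math


def pearson(x, y):
    """Zero order Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, controls):
    """Correlation between x and y while controlling for each list in controls.

    Applies r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2))
    recursively, so passing three control variables gives a third order
    partial correlation.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

If X and Y are correlated only because both depend on a shared factor Z (for example age or a genetic marker), `partial_corr(x, y, [z])` will be close to zero even when `pearson(x, y)` is large.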
2010-05-01: Initial RNA-seq data for brains and eyes of C57BL/6J and DBA/2J strains. We are now experimenting with RNA-seq expression data generated using SOLiD short (50 nt) reads. Links to initial RNA-seq data are provided for brain and eye. Data are plotted on a Log2 Y axis over a range from 2 to 10 (2^2 to 2^10 tags). (Implemented by Xusheng Wang).
2010-04-08: We have added new options for scatterplots. Font size, marker type and size, and colors can now be changed. As before, we output both types of correlations. (Implemented by Zachary Sloan).
Legend: Examples of scatterplots of Apoe and Aplp1 expression in the BXD hippocampus using both Pearson (left) and Spearman rank (right) correlations (50% of original size).
2010-03-31: First RNA-seq data available on the UCSC Genome Browser. The RNA-seq data were generated using whole brain mRNA (polyadenylated) from a total of 12 animals: 4 C57BL/6J young adults (2 males and 2 females), 4 DBA/2J young adults (also 2M and 2F), 2 B6D2F1 (1M and 1F), and 2 D2B6F1 (1M and 1F).
(Implemented by Xusheng Wang, Lu Lu, William Taylor, RWW, and David Li)
2010-03-31: Added search "tip text" to the main search page. (Implemented by Lei Yan)
2010-03-23: Rank correlation plots have been computed and implemented from Correlation Table output pages and Correlation Matrix output plots. We now routinely output both types of correlations for diagnostic purposes. Users can change the plot size and labeling. More output options are in progress. (Implemented by Zachary Sloan)
2010-03-11: Both dynamic and static web pages now include a separate header and footer that administrators can edit from the manager interface. (Implemented by Lei Yan)
2010-03-08: The interval mapping function for Mouse Diversity Panel (MDP) group has been fixed. (Implemented by Xiaodong Zhou)
2010-01-08: The Network Graph page has been improved. Several new graph algorithm options are provided. We added a function called "Lock Graph Structure" that allows the user to hold the position of all nodes and the length of all edges constant, making it easy to compare different correlation types. In the image, the nodes and the lines between them are now "hot linked" so that the user can easily check each trait and correlation. The user can also export the raw data (traits and correlations) in different formats as input to other network graph software such as Cytoscape. We also fixed a bug where the user was unable to correctly select an edge's color. (Implemented by Zachary Sloan and Xiaodong Zhou)
2009-12-04: Search page has been redesigned. The new design provides more help to new users. (Implemented by RW Williams)
2009-11-20: TouchGraph's Navigator software has an interesting applet that lets you see the relationship between GeneNetwork and other sites on the web. Click here to check it out.
2009-11-12: Swiss GeneNetwork site is on-line as of November 16. This mirror site is in the Laboratory for Integrative Systems Biology at the EPFL (Prof Johan Auwerx and colleagues) in Lausanne (Implementation by Evan Williams, Lei Yan, and Johan Auwerx.)
2009-11-6: Fly Toxicogenomic data sets are expected to be available on the GN production web site in early December 2009. We are currently experimenting with two data sets generated by Douglas Ruden and colleagues:
- UAB Whole body D.m. mRNA control (Oct09) RMA
- UAB Whole body D.m. mRNA lead acetate (pbAc) (Oct09) RMA
Raw data were provided by Dr. Grier Page and can be found on the GeneNetwork development server. (Implementation by Arthur Centeno, Robert W. Williams, Doug Ruden, Grier Page, and Xiaodong Zhou.)
2009-10-27: Fixed links to PKU SynDB. (The API interface method was changed from GET to POST.) (Implementation by Lei Yan)
2009-09-24: A GeneNetwork Archive/Time Machine server has been set up. This system allows users to work with older versions of GeneNetwork that correspond well with specific publications. Each version is designated by a time stamp (2009-03-04, for the March 4, 2009 version). (Implementation by Lei Yan)
2009-09-21: We have added 150 behavioral traits that are related to anxiety levels in the BXD strains of mice into GeneNetwork. Mice fall into five different treatment groups: baseline (BASE), treated with a saline control injection (no stress, only saline; NOS), treated with an ethanol injection (no stress, only ethanol; NOE), treated with mild restraint stress followed by a saline injection (RSS), and treated with mild restraint stress followed by an ethanol injection (RSE). Each of these five sets of 30 behavioral assays is matched to a corresponding hippocampal gene expression data set (Illumina). Please search for "Melloni" in the BXD Phenotypes database ANY or ALL fields. (Implementation by Xiaodong Zhou, Melloni Cook, and Lu Lu.)
2009-09-09: Unit testing of all GeneNetwork servers and mirrors has been implemented. This system monitors the performance of six different GN functions using Get commands. Success is evaluated by comparing the length of the returned html document with expectation. When a system fails, an error message is sent to the Sys Admin. (Implementation by Lei Yan)
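The check described above (issue a GET request, then compare the length of the returned HTML document with an expected length) might look roughly like the following Python sketch. The URLs, expected lengths, and the 10% tolerance are placeholders chosen for illustration; the entry does not say how closely the length must match, and the real monitoring script is not shown here.

```python
from urllib.request import urlopen


def fetch_length(url, timeout=30):
    """GET a page and return the length of the response body in bytes."""
    with urlopen(url, timeout=timeout) as response:
        return len(response.read())


def check_pages(expected, tolerance=0.10, fetch=fetch_length):
    """Return a list of (url, problem) pairs for pages that look wrong.

    expected maps each URL to the length we expect back. The tolerance
    (fraction of the expected length) is an assumption.
    """
    failures = []
    for url, expected_len in expected.items():
        try:
            actual = fetch(url)
        except OSError as exc:  # URLError and timeouts are OSError subclasses
            failures.append((url, f"request failed: {exc}"))
            continue
        if abs(actual - expected_len) > tolerance * expected_len:
            failures.append(
                (url, f"length {actual}, expected about {expected_len}"))
    return failures
```

A scheduled job could call `check_pages` with one entry per monitored function and send the returned list to the system administrator whenever it is not empty. The `fetch` parameter exists only so that the check can be exercised without network access.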
2009-09-09: Barley gene expression data have been integrated into GeneNetwork. All data are from Matthew Moscou, Roger Wise and Nick Lauter at Iowa State University. Data are currently public, subject to the corresponding acknowledgment of data use and disclaimer, and those interested in obtaining full access should contact moscou at iastate.edu (Implementation by Matthew Moscou, Robert W. Williams, and Arthur Centeno)
2009-09-09: We have added a simple web interface used by data managers to delete phenotype traits. (Implementation by Xiaodong Zhou)
2009-08-24: PARTIAL CORRELATIONS have been implemented in GeneNetwork on the beta test site and will soon be moved to our production site. You can now compute correlations between a "reference trait" and any other large set of traits while controlling for variability associated with up to three other factors. For example, it is possible to compute the correlation between the expression of the gene formin 2 (Fmn2) and all other 45,000 probe sets in the BXD hippocampus data set while controlling for the effects of the genotype at the location of the Fmn2 gene itself. This is done by using the SNP rs6375522 that is located in Fmn2 as a controlled cofactor. In this particular situation the partial correlation then provides you with a measure of the association between differences in formin 2 expression and all other transcripts in the absence of genetic variation in formin 2 itself. Partial correlations are a great way to study correlations between genes with control for sometimes unwanted sources of variability--linkage with neighboring genes being one important unwanted source. Partial correlations can also be used to remove effects of sex, age, batch effect, etc. Users can upload their own "correction" factors that can be used to calculate partial correlations. (Implementation by David Crowell)
2009-08-20: SNP BROWSER UPGRADE. We have reworked and renamed the SNP Browser module of GeneNetwork. The new name is Variant Browser, a change we made because we are adding data on indels (about 140,000 so far). All indel data is from our own comparative whole-genome sequence analysis of indels between C57BL/6J and DBA/2J. The figure below, for example, shows four indels in the mouse Hc gene. The Variant Browser is currently only useful for the analysis of mouse genomes, but we hope to provide useful links and tools from other species in the next few years. (Implementation by Evan G. Williams)
Legend: Example of the new Variant Browser illustrating results of an indel search for hemolytic complement (Hc).
2009-08-12: A massive human brain expression data set (363 cases) for the study of Late-Onset Alzheimer's Disease (LOAD) with age-matched elderly control subjects has been added to GeneNetwork. This cortical expression data set is taken from the work of Dr. Amanda Myers and colleagues (see GEO GSE15222, Webster et al., 2009). (Implementation by Robert W. Williams and Arthur Centeno)
2009-08-10: Service upgrade: We have developed a program that monitors free space on GeneNetwork servers. When hard drive space gets too tight, email notification is sent to system administrators. (Implementation by Lei Yan)
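A free-space monitor of the kind described above can be approximated with the Python standard library. This sketch only builds the list of warnings; the 10% threshold and the monitored paths are assumptions, and the email step is left out because the mail setup used on the GeneNetwork servers is not described in this entry.

```python
import shutil


def low_space_report(paths, min_free_fraction=0.10, usage=shutil.disk_usage):
    """Return one warning line per path whose free space is below the threshold.

    min_free_fraction is a hypothetical threshold (10% free by default).
    """
    report = []
    for path in paths:
        total, used, free = usage(path)
        free_fraction = free / total
        if free_fraction < min_free_fraction:
            report.append(f"{path}: {free_fraction:.1%} free")
    return report


if __name__ == "__main__":
    # Example run against the root file system; a real monitor would
    # pass the report to a mail routine instead of printing it.
    for line in low_space_report(["/"]):
        print(line)
```

The `usage` parameter defaults to `shutil.disk_usage` and is exposed only to make the function easy to test with made-up numbers.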
2009-08-10: The QTL Heat Map and the QTL Cluster Map have been significantly improved. The QTL heat map now allows the user to reorder traits with much greater flexibility, using all of the sorting functions that are built into the Trait Collection window (sort by position, sort by database, sort by symbol, sort by LRS, etc.). By default, the order of traits in the Trait Collection window is used by the QTL heat map. In addition, the QTL heat maps can be clustered, as usual, by trait correlations--the Cluster Map function developed originally by Elissa Chesler. We have made some minor changes to the heat map display that allow figures to be oriented horizontally. (Implementation by Xiaodong Zhou.)
2009-07-29: We have fixed the "Set to Default" function so that it works properly with any species on GeneNetwork. (Implementation by Xiaodong Zhou)
2009-07-27: All mouse genome position data on GeneNetwork have been updated from mouse assembly mm8 to assembly mm9. This has affected the position of the genes, probes, and SNPs. The mm8 assembly remains on the mirrors, temporarily. (Implementation by Evan Williams)
2009-07-27: We have fixed the Principal Component Analysis (PCA) function to handle as many traits as computationally feasible given the sample size and numbers of cases. This improved PCA code is used when computing correlation matrices. (Implementation by Xiaodong Zhou)
2009-07-26: Human brain expression data in patients with Alzheimer's disease and age-matched elderly control subjects. This cortical expression data set is taken from GEO GSE5281 (Liang et al. 2006). Samples were laser-captured from cortical layer 3 (except the hippocampus) and run on the Affymetrix U133 Plus 2.0 array. We renormalized the data to an average expression of 8 units on a log2 scale. Two versions of the data have been entered in GeneNetwork: one consisting of 157 of 161 arrays; the second consisting of what we regard as the best 102 arrays (those with mean correlations of better than 0.88 with all other arrays). Case IDs have the following code structure: Brain Region, GEO ID, Sex, Age, Disease Status. E119615M63N is a sample of the entorhinal cortex of case GSM119615, a male 63 year old normal case. The tissue codes are E = entorhinal cortex, H = hippocampus pyramidal layer, MT = medial temporal cortex, PC = posterior cingulate cortex, SP = superior frontal cortex, V = primary visual cortex. GeneNetwork does not allow sophisticated display of the data, but you can perform correlation analyses of any of the 56,000 probe sets. For example, expression of the APP transcript is higher in the AD cases and correlates well with many other AD related genes. At least 7.5% of cases are assigned incorrectly by sex (see INFO file). (Implementation by Robert W. Williams and Arthur Centeno)
2009-07-24: We have added annotation for the Affymetrix U133 Plus 2.0 human microarray (GPL570). The annotation file was downloaded from NetAffx in July 2009, with minor reannotation by Arthur Centeno prior to entry into GeneNetwork. This annotation file will be used with GSE5281 (steph-affy-human-433773), a data set of gene expression in six brain regions from normal and Alzheimer's disease patients.
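The case ID structure given above (Brain Region, GEO ID, Sex, Age, Disease Status) is regular enough to split with a short Python routine. The sketch below is an illustration only: the function name `parse_case_id` is hypothetical, the M/F sex codes are inferred from the single example, and because only the status code N (normal) is explained in this entry, any other status code is returned unchanged rather than interpreted.

```python
import re

# Tissue codes as listed in the news entry.
TISSUES = {
    "E": "entorhinal cortex",
    "H": "hippocampus pyramidal layer",
    "MT": "medial temporal cortex",
    "PC": "posterior cingulate cortex",
    "SP": "superior frontal cortex",
    "V": "primary visual cortex",
}

# Region code, numeric part of the GEO sample ID, sex, age, status code.
CASE_ID = re.compile(r"^(MT|PC|SP|E|H|V)(\d+)([MF])(\d+)([A-Z]+)$")


def parse_case_id(case_id):
    """Split a case ID such as 'E119615M63N' into its five fields."""
    match = CASE_ID.match(case_id)
    if match is None:
        raise ValueError(f"unrecognized case ID: {case_id!r}")
    tissue, geo_number, sex, age, status = match.groups()
    return {
        "tissue": TISSUES[tissue],
        "geo_id": "GSM" + geo_number,
        "sex": sex,
        "age": int(age),
        "status": status,  # 'N' = normal; other codes are left uninterpreted
    }
```

Applied to the example in the text, `parse_case_id("E119615M63N")` yields the entorhinal cortex, GSM119615, sex M, age 63, and status N.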
2009-07-22: We have finished the first phase of sequencing the genome of DBA/2J at UTHSC (SOLiD) and UCLA (Solexa) and have extracted a set of 2.8 million SNPs with comparatively high quality scores that differ between DBA/2J and C57BL/6J. Many of these SNPs are novel. Some overlap with Celera and Perlegen data. All SNPs have been added to the GeneNetwork SNP browser and are labeled with laboratory identifier symbols of the form "MRS1xxxxxxx". (Implementation by Evan Williams and Xusheng Wang)
2009-07-21: Imputed alleles have been added for 6 million SNPs and a panel of 74 strains in the GeneNetwork SNP browser. Data were generated by the Genome Dynamics group at the Jackson Laboratory using the Perlegen data set as a reference. The GN SNP browser tables are now significantly larger in terms of both mouse strain and SNP coverage. The figure below illustrates results for Zim3 in the GN SNP browser. (Implementation by Evan Williams)
Legend: Example of the updated SNP browser; SNP search results for Zim3
2009-07-20: SymAtlas links in GeneNetwork have been replaced by BioGPS links. These new links provide summary data on expression of genes in 50 or more tissues in several species. BioGPS also provides data on expression QTLs. We thank Dr. Andrew Su and colleagues at the GNF. (Implemented by Xiaodong Zhou)
GeneNetwork is integrated with the Gene-set Cohesion Analysis Tool (GCAT). GCAT determines the functional coherence of gene sets by performing latent semantic analysis of Medline abstracts. To try this new function, select a set of genes/transcripts in one of the GeneNetwork Correlation Tables and then click on the GCAT button (upper left). (Implementation by Xiaodong Zhou, Lijing Xu, Ramin Homayouni and colleagues at the University of Memphis)
Legend: GCAT output graph of genes associated with Comt
2009-06-20: Expression data for the ventral tegmental area (VTA) of the midbrain has been added to the BXD panel of mouse strains. Three related data sets--saline control, acute ethanol, and an ethanol vs saline contrast S scores--were generated by Dr. Michael Miles and colleagues. All data sets are currently being tested. Contact Dr. Miles at VCU Medical Center for access (Implemented by Arthur Centeno, Xiaodong Zhou, and Michael Miles)
2009-06-14: We have upgraded the Network Graph (Association Network) functions in GN. You can now change color and thickness of lines (edges) that connect nodes (usually transcripts or phenotypes). Each node of these graphs is hyperlinked to the underlying data. (Implemented by Lei Yan)
Legend: Example of the type of network graph that can be generated using both expression and phenotype data sets.
2009-06-05: Upgraded the annotation for the Affymetrix Rat RAE 230 and RAE 230 2.0 microarrays. We have also incorporated the probe sequence data for the newer array. Annotation for the rat Exon 1.0 ST array is also in progress. (Implemented by Xiaodong Zhou and RWW)
2009-04-28: The edit HTML function was changed from CGI to mod_python. Our goal is to eventually convert the few remaining functions in GN that still use CGI to mod_python. (Implementation by Xiaodong Zhou)
2009-04-28: In each probeset info file, a link was added (Accession number: GNxxx) so that users can download the raw data for that particular dataset. (Implementation by Xiaodong Zhou)
2009-04-24: Information on polymorphisms in micro RNA targets taken from the PolymiRTS Database has been added to interval map tables in GeneNetwork. (Implementation by Lei Yan)
2009-04-15: Microarray Annotation Files. We have begun to make all of the manually curated array annotation files used in GeneNetwork available here. (Implementation by Lei Yan, Arthur Centeno, Xusheng Wang, and Xiaodong Zhou)
2009-04-15: GN genotype data improvement. We have begun to revise all genotype database tables and data sets in anticipation of adding large mouse, rat, and human genotype data sets in the next few months. We have reconciled discrepancies between physical and genetic maps for all species and developed a new method to produce special files used for interval mapping (so-called "geno" files). (Implementation by Xiaodong Zhou and Rob Williams)
2009-04-15: GeneNetwork Database Improvement. All data (phenotype, genotype, and expression data) in GeneNetwork were previously entered into a single massive table (data table) in fully normal form. To improve performance we have now split quantitative data for individuals, cases, and strains into four related tables: 1. PublishData (classic phenotype data), 2. GenoData (genotypes), 3. ProbeSetData (gene expression data), and 4. ProbeData (individual probe expression data). The reason for the split is that these data types are used and updated in different ways. An original table that held all error term data (the SE table) has also been separated in the same way. Our intent is to improve SQL query performance and reduce recovery time. (Implemented by Xiaodong Zhou)
2009-04-07: Usage Statistics for Lily, one of the main GeneNetwork servers, are now available. GeneNetwork runs on a set of servers and we occasionally rotate the production server and staging server (currently named StatisticsLily and StatisticsProust). (Implemented by Lei Yan)
2009-02-20: We are making some improvements to the Genome Graph interface. (Implemented by Lei Yan)
2009-02-20: Trait Correlation Upgrade. The most important innovation from the user's point-of-view is a fast method to compute correlations among transcripts that reduces the response latency about 20-fold--from 90 seconds to under 5 seconds using standard Affymetrix and Illumina data sets consisting of 45000 probes or probe sets. Of even greater importance, this improvement means that GN can now handle massive Affymetrix Exon 1.0 ST data sets that have over 1.1 million probe sets in about 60 seconds. The new method exploits a set of text files that are external to the database (essentially a materialized view), parallel computing technique (Parallel Python) and optimized SQL queries. (Implementation by Xiaodong Zhou and David Crowell).
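The general shape of this approach can be sketched in Python. This is an illustration only, not the GN implementation: the probe IDs and values are made up, the in-memory dictionary stands in for the external text files, and the standard library's ThreadPoolExecutor stands in for Parallel Python purely to show the split-compute-merge structure (threads in CPython will not by themselves speed up this arithmetic).

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def standardize(values):
    """Center a vector and scale it to unit length, so that the dot product
    of two standardized vectors equals their Pearson correlation."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    norm = sqrt(sum(c * c for c in centered))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return [c / norm for c in centered]


def correlate_chunk(target, chunk):
    """Correlate one standardized target against a chunk of cached vectors."""
    return [(probe_id, sum(a * b for a, b in zip(target, vec)))
            for probe_id, vec in chunk]


def top_correlations(target_values, cache, workers=2, chunk_size=2, top=3):
    """Split the cache into chunks, correlate each chunk, merge, and rank."""
    target = standardize(target_values)
    items = list(cache.items())
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda chunk: correlate_chunk(target, chunk), chunks)
        results = [pair for part in parts for pair in part]
    results.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return results[:top]


# Hypothetical expression values for four probe sets across four strains.
raw = {
    "probe_a": [1.0, 2.0, 3.0, 4.0],
    "probe_b": [4.0, 3.0, 2.0, 1.0],
    "probe_c": [1.0, 3.0, 2.0, 4.0],
    "probe_d": [2.0, 1.0, 4.0, 3.0],
}
# Standardize once, ahead of time (the role of a precomputed cache).
cache = {probe_id: standardize(values) for probe_id, values in raw.items()}
hits = top_correlations([1.0, 2.0, 3.0, 4.0], cache)
```

Standardizing every vector ahead of time means each query reduces to one dot product per probe set, which is the part that benefits from being split across workers.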
2009-02-18: Data Security Upgrade. A new data security system has been active since Feb 18, 2009. You only have access to the confidential datasets to which you are assigned. If you have any questions, please contact Dr. Robert Williams or Xiaodong Zhou. (Implementation by Xiaodong Zhou and Hongqiang Li).
2009-02-13: Added a link to the Ontological Discovery Environment. (Implemented by Lei Yan)
2009-01-20: Human and Rat Trait Correlation pages have been upgraded to include data on tissue correlations and literature correlations. (Implementation by Xiaodong Zhou).
2009-01-16: Advanced Search GUI. A GUI for the advanced search function has been developed. This GUI is much more user friendly than the old text interface. (Implementation by Lei Yan).
2009-01-08: Link to WebGestalt 2.0. The links in GN to WebGestalt have been switched to its new edition, WebGestalt 2.0. (Implementation by Lei Yan).
2008-12-30: The UML class diagrams for GN python code have been finished. The inheritance among classes and dependency among modules are also represented by UML diagrams. (Implemented by Xiaodong)
2008-12-30: The UML database diagram for GN database has been finished (Implemented by Hongqiang)
2008-12-23: Heritability values have been added to the BXD Eye HEIMED mRNA data set. These data are shown in the Basic Statistics page for this particular data set. We hope to routinely add this type of information for new array data as they are added to GeneNetwork. (Implementation by Lei Yan and Xusheng Wang).
2008-12-19: Array Annotation Query, Download, and Annotate module has been added to GeneNetwork. (Implementation by Lei Yan and Xusheng Wang).
2008-12-18: New rat HXB/BXH gene expression data from Norbert Hubner and colleagues is being entered into GeneNetwork. The following four tissue types are being added: adrenal glands, soleus muscle, heart, and liver. These data are still experimental but will be opened in 2009.
2008-12-04: Tissue Correlation scores for the great majority of genes have been added to GeneNetwork. A high tissue correlation between two genes indicates that they tend to be expressed together across a set of diverse tissues and organs. Tissue Correlations can be computed from most Trait Data and Analysis pages by selecting Tissue Correlation, Pearson's r or Tissue Correlation, Spearman's rho using the Trait Correlation analysis tool. (Implementation by Xiaodong Zhou and Xusheng Wang)
2008-12-04: The University of Western Australia GN mirror site is up and running in the laboratory of Dr. Grant Morahan. This system was built and is maintained by Munish Mehta. Munish also has an active code development program with novel tools and resources.
2008-12-03: The JAX Mouse Diversity Genotyping array is being used to regenotype the BXD strains. This array was designed by Fernando Pardo-Manuel de Villena and Gary Churchill. The array generates as many as 625,000 SNPs and 900,000 invariant genomic probes (one SNP about every 4.3 kb, see JAX Notes, Winter 2008, No 512). The genotypes will be integrated into GeneNetwork in the next three months. (Implementation for GN by Lu Lu, R. Williams, and Hongqiang Li).
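For readers unfamiliar with the two options, the following Python sketch shows how Pearson's r and Spearman's rho differ. It is a generic textbook implementation, not GN code, and the expression values for the two genes are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of the data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical expression of two genes across five tissues.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]
r = pearson_r(gene_a, gene_b)
rho = spearman_rho(gene_a, gene_b)
```

Here the relationship is monotonic but not linear, so rho reaches 1 while r stays slightly below it. Spearman's rho is also less sensitive to a single tissue with extreme expression.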
2008-11-28: Gene-set Cohesion Analysis Tool has been integrated into GN. GCAT Home (Implemented by Xiaodong Zhou)
2008-11-10: A new server, Lily, has been installed and configured as the main GN server. (Implemented by Lei Yan)
2008-10-20: The first human expression data set for lymphoblasts (Epstein-Barr virus immortalized B-cells) from the CEPH panel (large Mormon families) has been integrated into GeneNetwork. The data are originally from a study by Monks and colleagues (2004). Please note: These data can currently be used to study patterns of coexpression among transcripts, but we have not yet implemented mapping algorithms and do not expect to make interval mapping available until summer 2009. We thank Stephanie Santorico for providing her data and help. (Implemented by Hongqiang Li, Stephanie Monks Santorico, Xusheng Wang, and Arthur Centeno.)
2008-10-15: We have uploaded four LXS expression data sets for the mouse hippocampus (NOS, NOE, RSS, and RSE; n = 33 strains per data set generated using Illumina Mouse WG-6.1 Beadarray). The four sets are part of a study on the effects of ethanol and stress on brain gene expression. Please see the INFO files associated with these data sets for more background, for example, the "no restraint-saline" condition (NOS). (Implemented by Lu Lu and Arthur Centeno.)
2008-10-05: Major new BXD behavioral and drug-related phenotypes for 62 strains have been integrated into GeneNetwork by a large research consortium funded by the National Institute on Drug Abuse and the National Institute for Alcohol Abuse and Alcoholism. A total of 242 phenotypes were measured in collaboration with the Systems Genetics Group at the Oak Ridge National Laboratory. These records can be searched by selecting the BXD Phenotypes data set and entering the search string "Chesler." Please request permission from Drs. EJ Chesler and D Goldowitz to use these data. Primary phenotypic data on individual mice are available in the Mouse Track System. (Implemented by Elissa Chesler (firstname.lastname@example.org), Philip VM, Ansah TA, Blaha CD, Cook MN, Hamre KM, Lariviere WR, Matthews DB, Mittleman G, Goldowitz D)
2008-10-04: UCLA Department of Genetics GeneNetwork mirror site is now online. Data in this site is synchronized periodically (once per month) with the Tennessee production site. (Implemented by Evan G. Williams and Kev Adler.)
2008-10-05: An XML schema for expression genetic data sets of the type used by GeneNetwork has been generated by Ilze Druka and colleagues. This schema is based in part on the GeneNetwork database. (Implemented by Ilze Druka)
2008-09-15: GN code improvement by cleaning up "namespace pollution." Every GN Python module used to import other modules with "from SomeModule import *" so that the classes, functions, and variables of the imported module could be used without the module name as a prefix. Since a typical GN Python module imports over ten modules, and GN Python modules import one another in chains (B imports A, C imports B, D imports C ...), the namespace was seriously polluted. This not only made the source code very difficult to read (for anything not defined in the current module, a programmer might have to look through dozens of imported modules to find where it was defined), but also obscured the dependencies among modules, so that a change to one module often had "unpredictable" effects on other modules. As the GN software grew, the namespace pollution problem kept building up and made it harder and harder to maintain the software and develop new features. A lot of effort has gone into changing all GN Python modules to use "import SomeModule" and adding the module's name as a prefix to the classes, functions, and variables of the imported module (thousands of places). This work greatly improves source code readability and makes the dependencies among GN Python modules clear, which eases maintenance and future development. (Implemented by Xiaodong Zhou)
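The difference between the two import styles can be shown with a small Python example. The standard library modules math and cmath are used here only as stand-ins for GN modules, which are not reproduced.

```python
# Style that was removed: wildcard imports hide where a name comes from.
#   from math import *
#   from cmath import *   # silently replaces math's sqrt with cmath's sqrt
#   sqrt(4)               # now returns a complex number, not a float
#
# Style that replaced it: import the module and use its name as a prefix,
# so every call site states which module the function belongs to.
import cmath
import math

real_root = math.sqrt(4)        # float square root from math
complex_root = cmath.sqrt(-4)   # complex square root from cmath
```

With prefixed names, two modules can define a function with the same name without one silently overriding the other, and a reader can tell at a glance which module a call depends on.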
2008-07-29: GeneNetwork TWiki site has been moved to a new server. (Implemented by Kev Adler)
GeneNetwork Roundup bug tracking site has been moved to a new server. (Implemented by Hongqiang Li)
2008-07-30: Massive UCLA microarray data sets for four F2 populations and four tissues have been integrated into GeneNetwork. Data are now available for up to 300 animals for each of these tissues:
- Adipose tissue (see 2007)
- Brain (see 2006)
- Liver (see 2007)
- Muscle (see 2010 and GEO series GSE12795)
Data sets were generated by Jake Lusis, Eric Schadt, and colleagues and include one of the first expression genetics studies, published in 2003 (Mouse BDF2 UCLA). Papers that describe several of these new data sets have been published (Yang et al. 2006; Schadt et al., 2008) and several massive data sets are open access (Mouse BHF2-Apoe UCLA, from GEO GSE2814, GSE3086, GSE3087, and GSE3088). Several other data sets are still being analyzed. For early access to still unpublished data sets (e.g., BH/HB F2 UCLA and CastB6/B6CastF2 UCLA) please contact Dr. Aldons Jake Lusis and colleagues. (Implemented by Evan Williams)
2008-08-29: EYE data: The Mouse HEIMED whole eye gene expression data set has been extended to 99 strains, including 27 common inbred strains. With support from Dr. Barrett Haik, Eldon Geisert, Lu Lu and colleagues we have added data for 15 new strains of mice. Arrays were processed at the UTHSC by Weikuan Gu and colleagues. This data set is open for use without a password. (Implemented by RWW, Daniel Ciobanu, Lu Lu, and Arthur Centeno)
2008-07-15: The GeneNetwork MySQL configurations have been optimized, resulting in much faster and more reliable service. (Implemented by Kev Adler)
2008-06-20: GeneNetwork is now part of the NIH NCRR Biomedical Informatics Research Network (BIRN) with a fully configured BIRN equipment rack. Our thanks to the BIRN Coordinating Center and to the IT staff at UTHSC. (Implemented by Bao Nguyen, Mark James, James Martin, Billy Hatcher, and Kev Adler)
2008-05-20: Neocortex: With support from the High Q Foundation, we have added a matched neocortex (cerebral cortex) gene expression data set for 73 strains of mice to accompany the striatum data set highlighted below. The neocortex data set includes estimates of expression for 20 common inbred strains, 52 BXD strains, and B6D2F1 hybrids, generated using the Illumina Mouse-6 v1.1 Sentrix array. Samples were generated by Glenn Rosen, Lu Lu, and colleagues. Arrays were processed at the UTHSC. This data set is open for use without a password. (Implemented by HS, RWW, Lu Lu, and Arthur Centeno)
2008-05-20: Striatum: With support from the High Q Foundation, we have added a striatum (caudate-putamen) gene expression data set for 75 strains of mice. The data set includes estimates of expression for 20 common inbred strains, 54 BXD strains, and B6D2F1 hybrids, generated using the Illumina Mouse-6 v1.1 Sentrix array. Samples were generated by Glenn Rosen, Lu Lu, and colleagues. Arrays were processed at the UTHSC. This data set is open for use without a password. (Implemented by HS, RWW, Lu Lu, and Arthur Centeno)
2008-04-20: Lung: We have added a lung gene expression data set for 57 strains of mice. The data set includes estimates of expression for 8 common inbred strains, 47 BXD strains, and reciprocal F1 hybrids (B6D2F1 and D2B6F1) generated using the M430 2.0 Affymetrix array. Samples were generated by Lu Lu and colleagues. Arrays were processed by Yan Jiao and Weikuan Gu at the Memphis VA. This data set is still provisional and not available without a password. If you would like early access, please contact Prof. Klaus Schughart (Helmholtz Centre for Infection Research, Braunschweig, Germany) at email@example.com. (Implemented by HS, RWW, Lu Lu, and Arthur Centeno)
2008-04-10: Nucleus Accumbens: Dr. Michael Miles and colleagues have added gene expression data for the nucleus accumbens of the BXD strains into GN. The nucleus accumbens is an important part of the brain involved in emotional state and reward. It is also critically involved in drug abuse and alcoholism. Three complementary data sets have been submitted: expression in nucleus accumbens following a saline control injection, expression in nucleus accumbens following an injection of ethanol, and a data set that highlights the difference in expression between the two conditions (saline and ethanol). Each data set includes estimates of expression for 35 BXD strains, as well as C57BL/6J and DBA/2J. Samples were generated and processed by Miles and colleagues. The saline control data set is available without a password, but the two other data sets are still available only with a password. If you would like early access, please contact Prof. Michael F. Miles (Virginia Commonwealth University) at firstname.lastname@example.org. (Implemented by MM and Arthur Centeno.)
2008-04-10: Prefrontal Cortex: Dr. Michael Miles and colleagues have added gene expression data for the prefrontal cortex of the BXD strains into GN. The prefrontal cortex (PFC, or prelimbic neocortex) is an important part of the brain involved in emotional state and reward. It is also critically involved in drug abuse and alcoholism. Three complementary data sets have been submitted: expression in PFC following a saline control injection, expression in PFC following an injection of ethanol, and a data set that highlights the difference in expression between the two conditions (saline and ethanol). Each data set includes estimates of expression for 35 BXD strains, as well as C57BL/6J and DBA/2J. Samples were generated and processed by Miles and colleagues. The saline control data set is available without a password, but the two other data sets are still available only with a password. If you would like early access, please contact Prof. Michael F. Miles (Virginia Commonwealth University) at email@example.com. (Implemented by MM and Arthur Centeno.)
2008-03-26: Dr. Fan Zhang has left the GN software development group. He has spent the past five months substantially reworking the architecture of GN hardware, rewriting the code to increase its portability, and improving security. Thanks Fan for your many contributions and good luck back home in China.
2008-03-17: Two Affymetrix Mouse EXON ST 1.0 array data sets for the hippocampus (n = 84 strains) and striatum (n = 48 strains), including a variety of common inbred strains and numerous BXD strains, have been integrated this week into GeneNetwork (see Mouse--BXD--Hippocampus and Striatum). These data were generated with the support of Affymetrix, Dr. David Kulp, and the High Q Foundation. Annotation of these two large data sets is in progress. (Implemented by Manjunatha Jagalur, Arthur Centeno, Xusheng Wang, Lu Lu, and Hongqiang Li).
2008-03-14: GeneNetwork site has been moved to two new Dell PowerEdge 1950 and 2950 8-core servers to provide more capacity and higher performance. (Implemented by Fan Zhang).
2008-03-02: GeneNetwork site usage is now being studied using Google Analytics. We hope this will allow us to improve usage and performance for GN clients. (Implemented by Fan Zhang).
2008-02-17: We have finished synchronizing the updated GN code between the UTHSC GN main site and the HZI GeneNetwork mirror. This is still a manual process that needs to be done once every few months. (Implemented by Fan Zhang and Rudi Albert).
2008-02-14: We have implemented a system that improves the synchronization of the main production database with the set of servers that are part of the cluster. These servers are referred to by our development team as GN server "bundles". (Implemented by Fan Zhang and Kev Adler).
2008-01-20: GeneNetwork consists of a cluster of servers that have identical software code and identical databases. The creation of the individual servers (bundles) that are part of the GN cluster is quite complicated. Each bundle consists of an application stack (Linux CentOS 5, MySQL 5.0, Apache HTTP server, Mod_Python, and many code libraries and scripts). Components of the bundles ideally work well regardless of the physical hardware and network configuration. We have now rewritten and annotated the GN code and reconfigured bundles to make them as independent as possible from their particular network situation. We refer to this effort as "de-hardcoding" or refactoring GN. GN bundles can now be deployed with minimal manual work. The following tasks still need to be done: configuration of IP addresses, specification of absolute directory structure as DocumentRoot for Apache, and recompilation of certain third-party Python libraries according to hardware infrastructure (i386 vs x86_64). (Implemented by Fan Zhang). The number of bundle servers has been increased to six.
GN Bundle Setup
2008-01-12: Probe level data have been added into the UCHSC BXD Whole Brain M430 2.0 November 2006 RMA data set. Affymetrix data sets in GeneNetwork always include the higher level "probe set" data, but in the case of this important whole brain data set from The University of Colorado we have also entered all of the individual probe level data. This means that users can drill down to examine the expression of individual 25-mer probes. To access the individual probe data please click on the PROBE TOOL button on the Trait Data and Analysis page of GN. Then click on any individual probe in the left-hand column. (Implemented by Arthur Centeno, Hongqiang Li, and Daniel Ciobanu.)
2008-01-08: We have added a large Mouse Striatum Expression data set into GeneNetwork. This data set includes replicate estimates of expression in the dorsal striatum for 75 strains of mice (54 BXDs and 21 common strains). All data were generated using the new Illumina Mouse 6.1 bead array with support from the High Q Foundation and the NIH. A matched neocortical expression data set will be uploaded into GN in the next few months. (Implemented by Glenn Rosen, Lu Lu, Arthur Centeno, Hongqiang Li, and Rob Williams.)
Legend: Expression of the dopamine D1a receptor mRNA in the striatum from the new Illumina data set. Each bar provides expression values (log2 transformed values) for a single strain.
2007-12-08: John Stuart and colleagues have uploaded a Spleen M430 2.0 Expression data set into GeneNetwork for the CXB recombinant inbred strains and the two parents of this RI set (C57BL/6By and BALB/cBy). This data set includes replicate estimates of expression for the whole spleen. (Implemented by Lu Lu, Arthur Centeno, Hongqiang Li, and Rob Williams.)
2007-10-20: In progress: GeneNetwork developers are working closely with the Biomedical Informatics Research Network to map GN data sets and metadata into OWL/RDF concepts. (Implemented by Rob Williams, Fan Zhang, and Bill Bug.)
2007-10-19: The GeneNetwork codebase has been more thoroughly documented/annotated over the past several weeks by Kev Adler and Fan Zhang. Many important changes have been made to abstract the code and its calls to external resources. (Implemented by Fan Zhang.)
2007-10-16: The performance of the MySQL database as the data vault has been increased dramatically. The time needed to run a standard correlation mapping has decreased from 3-4 minutes to less than 2 seconds. (Implemented by Fan Zhang.)
2007-10-16: A small Java application has been written to handle early steps in the processing and error-checking of microarray data sets that are heading into GeneNetwork. The program is still a beta version and is available upon request from Hongqiang Li. It is called ArrayPipeliner. (Implemented by Hongqiang Li.)
2007-10-12: GeneNetwork has been moved to a new cluster. The GN MySQL relational database is now running on a Dell 2950. Our apologies for inconvenience associated with this move. (Implemented by Fan Zhang and Hongqiang Li.)
2007-10-24: A preliminary version of the Standard Operating Procedures for entering new microarray data sets into GeneNetwork is now available on the GeneNetwork TWiki pages. (Implemented by Arthur Centeno, Hongqiang Li, and Rob Williams.)
2007-10-02: The Illumina mouse microarray annotation file has been substantially updated and extended by Xusheng Wang, Hongqiang Li, and Rob Williams. The new annotation file covers four variant arrays: Mouse 6, Mouse 6.1, Mouse 8, and Mouse 8.1. This annotation file is used by the Mouse LXS Hippocampus data (Mouse 6) and by three new Mouse BXD Hematopoietic cell data sets (Mouse 6.1) generated by Gerald de Haan and colleagues. The Illumina annotation files are available for users of GeneNetwork, but are not yet freely available as a complete text file download. If you need the new annotation file rather than that provided by Illumina, please contact Robert W. Williams. (Implemented by Xusheng Wang, Hongqiang Li, and Rob Williams.)
2007-07-27: The SGO Literature Correlation data generated by the Semantic Gene Organizer (SGO) team has been updated. SGO is a patented algorithm based on latent semantic analysis that provides a score of the terminological overlap between any two genes based on PubMed records and abstracts (M Berry, Kevin Heinrich, R Homayouni, and colleagues). These scores range from 0 to 1 (cosine similarity) and can be used like correlation coefficients, although all values are positive. GeneNetwork provides the Literature Correlations along with expression level correlations in most Correlation Table output pages. Roughly 5% of literature correlations have an r value greater than 0.6. The Literature Correlation covers roughly 75% of known genes in mouse, rat, and human.
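As a rough illustration of the scoring step only, the Python sketch below computes the cosine similarity between two term-weight vectors. It does not reproduce the SGO algorithm or its latent semantic analysis; the three gene vectors and their weights are hypothetical, and the weights are assumed to be non-negative, which keeps the score between 0 and 1.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an empty profile as sharing nothing
    return dot / (norm_u * norm_v)


# Hypothetical term weights for three genes over the same four terms.
gene_1 = [2.0, 1.0, 0.0, 3.0]
gene_2 = [1.0, 0.0, 0.0, 2.0]
gene_3 = [0.0, 0.0, 5.0, 0.0]

related = cosine_similarity(gene_1, gene_2)    # overlapping terms
unrelated = cosine_similarity(gene_1, gene_3)  # no terms in common
```

Two genes described with the same terms in similar proportions score close to 1, and genes with no terms in common score 0.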
The new 2007 dataset covers 21,903 genes and was derived from a corpus consisting of 242,365 MEDLINE abstracts. (The previous literature correlation dataset dated back to 2005 and was derived from a set of 108,367 abstracts covering 13,129 genes.)
We have also added a new SGO Literature Correlation feature to the Trait Data and Analysis Form. You can now select any gene and find the top genes (100 to 2000) with which it shares common terms and context based on the literature. To try this feature select one of the Calculate: SGO Literature Correlation options in the Trait Correlations part of the Trait Data and Analysis form. Normally GeneNetwork computes correlations based on expression and then provides the literature correlation as a secondary data type. This new feature reverses the situation; now the literature correlation is primary and the expression data are given as a secondary data type. This is a quick and powerful way to determine whether expression data support particular correlations extracted from the literature. (Implemented by Lijing Xu, Nick Furlotte, Ramin Homayouni.)
Legend: The Literature Correlation feature can be accessed from the Trait Correlation tool.
2007-07-26: Affymetrix Mouse Exon 1.0 ST data are now available for the first time in GeneNetwork for the striatum of 30 BXD strains and 20 standard inbred strains. These data were generated with the support of the High Q Foundation. (Implemented by Hongqiang Li, Arthur Centeno, Manjunatha Jagalur.)
Legend: Access to the Mouse Exon 1.0 ST Affymetrix expression data set from the High Q Foundation.
2007-06-21: A European GeneNetwork mirror site http://genesys.helmholtz-hzi.de has been established at The Helmholtz Zentrum für Infektionsforschung in Braunschweig, Germany. A GeNeSys private site will operate as part of the German Network for Systems Genetics (GeNeSys). (Implemented by Evan G. Williams and Klaus Schughart, Experimental Mouse Genetics.)
2007-04-26: Zhaohui Sun, one of our lead programmers, is moving to Dallas with his family. As a quick review of these recent News items will show, Zhaohui has made many important contributions to GeneNetwork over the past year. Thanks Zhaohui. We will miss you.
2007-04-26: The SNP Browser that is integrated into GN has been further improved. It is now possible to filter SNPs by domain (exon, intron, etc.) and by function (e.g., mis-sense, silent). The updated version also includes a wider variety of strains of mice. (Implemented by Zhaohui Sun.)
2007-04-25: GeneNetwork's Scriptable Interface is now being extended at the request of Dr. Graeme Wistow of the National Eye Institute NEIBank so that external users and bioinformatics site developers can directly access the "best" data for particular genes and transcripts from specific data sets.
For example, it is now possible to link directly to the Basic Statistics data page for the expression of the rhodopsin gene in the eyes of BXD mouse strains using a string that has this form:
"http://www.genenetwork.org/webqtl/main.py?FormID=showBest&gene=Rho&database=Eye_M2_0906_R" (no line break)
where "gene=Rho" can be replaced by "refseq=NM_145383" to retrieve data using the RefSeq identifier (do not use the decimal point and digit that may follow, such as "refseq=NM_145383.1"), or can be replaced by "geneid=212541" to retrieve data using the Entrez Gene identifier.
Although not recommended, the string "&searchAlias=1" can be added at the end of the command to retrieve data using the alias of a proper gene symbol when the original does not work. Thus "gene=RP4" will resolve to "gene=Rho" only if you add "&searchAlias=1" at the end of the command.
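The rules above for assembling a request string can be sketched in Python. The helper name show_best_url and its argument names are hypothetical; only the base address and the FormID, gene, refseq, geneid, database, and searchAlias parameters come from the description above. The code only builds the string and makes no request, so it says nothing about whether the address still responds.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.genenetwork.org/webqtl/main.py"


def show_best_url(database, gene=None, refseq=None, geneid=None,
                  search_alias=False):
    """Build a showBest link from exactly one gene identifier."""
    identifiers = [("gene", gene), ("refseq", refseq), ("geneid", geneid)]
    chosen = [(key, value) for key, value in identifiers if value is not None]
    if len(chosen) != 1:
        raise ValueError("give exactly one of gene, refseq, or geneid")
    key, value = chosen[0]
    if key == "refseq":
        # Drop a version suffix such as ".1", which the text says not to send.
        value = str(value).split(".")[0]
    params = [("FormID", "showBest"), (key, value), ("database", database)]
    if search_alias:
        params.append(("searchAlias", "1"))
    return BASE_URL + "?" + urlencode(params)


by_symbol = show_best_url("Eye_M2_0906_R", gene="Rho")
by_refseq = show_best_url("Eye_M2_0906_R", refseq="NM_145383.1")
by_alias = show_best_url("Eye_M2_0906_R", gene="RP4", search_alias=True)
```

Passing the parameters as an ordered list keeps them in the same order as the example string, and urlencode takes care of escaping any characters that are not safe in a query string.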
When a particular gene is associated with more than one probe or probe set, our code selects the single probe or probe set with the highest average expression in that particular data set. For example, there are four probe sets that target different parts of rhodopsin mRNAs. Using the Affymetrix M430 2.0 array, probe set 1425172_at has the highest expression in the eyes of BXD strains (the probes target the last two exons and the proximal 3' UTR).
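The selection rule can be written as a short Python sketch. This is not the GN code: the function name is hypothetical, the expression values are invented, and apart from 1425172_at the probe set identifiers are placeholders.

```python
def best_probe_set(expression_by_probe_set):
    """Return the ID of the probe set with the highest mean expression."""
    if not expression_by_probe_set:
        raise ValueError("no probe sets to choose from")

    def mean(values):
        return sum(values) / len(values)

    return max(expression_by_probe_set,
               key=lambda probe_id: mean(expression_by_probe_set[probe_id]))


# Hypothetical log2 expression values across three strains for four probe
# sets of one gene; only 1425172_at is an identifier named in the text.
rho_probe_sets = {
    "1425172_at": [14.1, 13.8, 14.3],
    "placeholder_probe_set_b": [9.2, 9.5, 9.0],
    "placeholder_probe_set_c": [11.0, 10.7, 11.4],
    "placeholder_probe_set_d": [7.9, 8.1, 8.0],
}
selected = best_probe_set(rho_probe_sets)
```

Ranking by mean expression favors the probe set with the strongest signal, which is not necessarily the one that best represents every transcript variant of the gene.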
To choose a particular data set one needs to know the appropriate code, such as Eye_M2_0906_R in the example above. These codes can currently be obtained from:
This implementation will be moved from our beta site to the production GN site in early May, 2007. (Implemented by Hongqiang Li.)
2007-03-23: We have annotated Illumina's Sentrix Mouse-6 1.0 microarray BeadArray platform (see Mouse LXS Hippocampus data sets). We have added or corrected gene assignments, symbols, and gene descriptions for almost all of the 46,166 Illumina probes. We added many data types not provided by Illumina and the MEEBO consortium in their original annotation files (http://www.microarray.org/data/download/MEEBO_Data.txt), including updated Entrez Gene IDs, gene ontology categories, human orthologs, OMIM identifiers, and the precise locations of the 50-mer probe sequences on the most recent mouse genome assembly (Feb 2006, mm8). Some helpful metrics on this annotation: For 46166 probes on the Mouse 6 array platform (including control probes) we have identified 36687 NCBI Entrez Gene IDs; 26180 matched human Gene IDs; 23899 matched rat Gene IDs; 26882 NCBI HomoloGene IDs; and 12790 OMIM IDs. Position data for the 50-mer Illumina Mouse-6 array were BLAT aligned to the latest mouse genome assembly by Hongqiang Li. Many of the probes and alignments have been error-checked to a limited extent by RWW. Annotation is still continuing and we will be adding new data over the next several months. (Curation by Robert W. Williams, implemented into GeneNetwork by Hongqiang Li.)
2007-03-02: The new production site of GeneNetwork has been built from the code base maintained in Subversion. Subversion has been fully incorporated into our software development practice. Previously we built a development / demo site - http://web2qtl.utmem.edu from Subversion, which will be our official beta site from now on. The previous beta site has been merged into our current production (main) site. Users can still access the old main and the old beta site through the archive: http://www.genenetwork.org:82/search.html. This archive site still uses the position data from the older assembly of the mouse genome (UCSC mm6) so it allows users to retrieve old results. (Implemented by Zhaohui Sun.)
2007-02-21: The SNP Browser that is integrated into GN has been updated to include all Perlegen/NIEHS SNPs, all Celera SNPs, Wellcome Trust-CTC SNPs, and the Mouse Haplotype Map SNPs. We will continue to improve the annotation and strain coverage of these SNPs, but you will find that you can already download +/- 100 nt around the SNP or automatically BLAT SNPs to the UCSC Genome Browser. It is also now possible to search for SNPs by some of their identifiers. We have intentionally left in duplicate SNPs to allow you to evaluate data consistency. The "Gap" value will be 0 for redundant SNPs. (Implemented by Zhaohui Sun.)
2006-11-28: We finished converting all mouse genomic and genetic data sets to the latest mouse genome sequence assembly (mm8). Prior to this date all mouse sequence data used position data from the March 2005 mm6 assembly (equivalent to NCBI Build 34). We now use mm8 position data throughout, equivalent to NCBI Mouse Build 36 and Ensembl Mus musculus version 40.36a, released in February 2006. (Implemented by Zhaohui Sun, Evan Williams and Rob Williams.)
2006-10-30: LXS Hippocampus Illumina Mouse-6 Sentrix array data (77 strains of mice represented by male and female pools, 43,514 probe sequences) have been uploaded into the beta site for testing. (Implemented by Lu Lu and Hongqiang Li.)
2006-09-16: The code base for GeneNetwork, WebQTL, and many other code modules used by GN has been successfully moved into a source code revision management system called Subversion. This will greatly improve the GN software team's ability to maintain the code, develop new applications, and collaborate with other research groups to expand GN. This work provides a solid foundation for the scalability and portability of GeneNetwork. As part of this effort, we have set up a new beta site that is currently the official "product" of the Subversion tree at http://web2qtl.utmem.edu. (Implemented by Zhaohui Sun and Hongqiang Li.)
2006-08-21: The GeneNetwork TWiki Code and Hardware documentation site has been set up at
http://webqtl.utmem.edu:81/bin/view/GeneNetwork/WebHome. (Implemented by Stephen Pitts and Zhaohui Sun, with help from Bill Bug, and Hongqiang Li.)
2006-08-20: Members of the Kidney Consortium have contributed a very large gene expression data set for 68 strains of mice (adults of both sexes), including 53 BXD strains, to GeneNetwork. The August 2006 and July 2006 data sets are preliminary and will be subject to change. Due to strong sex differences and imperfect representation of all strains by arrays from both sexes, we recommend using the sex-corrected data sets. The Kidney Consortium is led by Erwin Bottinger at Mt. Sinai School of Medicine. All array data were processed by Kremena Star (MSSM), and all samples were generated by Lu Lu and colleagues at UTHSC. Data have been normalized by Hongqiang Li, R Williams, and Kremena Star. (Implemented by Kremena Star, Lu Lu, Hongqiang Li, Russell Chesney, Robert Williams, and Erwin Bottinger.)
2006-08-16: We soon will convert all mouse genomic and genetic data sets to the latest mouse genome sequence assembly. Through August 2006 all mouse sequence data has used position data from the March 2005 mm6 assembly (equivalent to NCBI Build 34). Starting September 2006 we will convert all position data to mm8, equivalent to NCBI Mouse Build 36 and Ensembl Mus musculus version 40.36a, released in February 2006. (Implemented by Evan Williams and Rob Williams.)
2006-07-12: Barley gene expression data and new SNP genotypes are being integrated into the GeneNetwork beta site for testing. All data are from Arnis Druka at the Scottish Crop Research Institute. Data are currently password protected, and those interested in obtaining access should contact Arnis.Druka@scri.ac.uk (Implemented by Jintao Wang, Arnis Druka, and Rob Williams.)
Legend: Access to the new Barley Affymetrix expression data set from the Scottish Crop Research Institute.
2006-07-11: Jintao Wang, the lead programmer for GeneNetwork, WebQTL, GenomeGraph, and GeneWiki has accepted a position at Federal Express. We are really sorry to see Jintao leave later in July, but wish him all the best at this exciting new position in one of the world's greatest companies. (Implemented by Jintao Wang ;-)
2006-07-10: The Mouse Phenome Database (MPD) and several other large data sets are being integrated into GeneNetwork's Mouse Diversity Panel. To access these new data sets please select "MOUSE-GROUP-Mouse Diversity Panel". The Mouse Diversity Panel will eventually include the MPD, additional strain data sets extracted from the published literature, the Wellcome Trust-CTC SNP collection, and several large gene expression data sets, including those for whole brain, hippocampus, cerebellum, and eye. (Implemented by Jintao Wang and Evan G. Williams.)
Legend: Access to the new Mouse Diversity Panel data sets.
Legend: Bar chart of white blood cell counts across 43 strains of mice taken from the Mouse Diversity Panel. Virtually all of the phenotype data are provided by the Mouse Phenome Project.
2006-06-23: The GeneWiki annotation and open note making system has been upgraded and now has an independent search page (see the Search menu, pop-down). We expect to make additional changes to the interface over the next few weeks. (Implemented by Jintao Wang.)
2006-06-23: Three new versions of the Hippocampus Consortium expression data set for adult BXD strains (June06 in MAS5, RMA, and PDNN versions). These data sets exclude several marginal arrays and correct for one incorrectly labeled strain in older data sets. (Implemented by Hongqiang Li.)
2006-06-12: GenomeGraph is being rewritten to exploit a Scalable Vector Graphics (SVG) interface that allows zooming and other advanced GUI features. Visit the beta version of GenomeGraph for a test drive. You will need an SVG plug-in for your machine. We do not know of an effective universal SVG plug-in for Macintosh Intel machines. Therefore, if you have a Macintosh with an Intel processor you will currently need to force Safari to open using the Rosetta emulation mode. It is easy to do this. Just follow these directions. Running Safari using Rosetta will slow things down a bit, so consider this a temporary solution. (Implemented by Jintao Wang.)
2006-06-09: Powerful new search method (RIF=your-text-here) that exploits the new Gene "Reference into Function" (GeneRIF) taken from NCBI. A search string such as rif=autism or RIF=Autism will find all genes/transcripts/proteins that are known to be associated with autism based on the GeneRIF entries from NCBI (n = 125 hits using one of the Affymetrix M430 expression data sets). (Implemented by Jintao Wang.)
2006-06-08: GeneWiki has been upgraded to include the current set of NCBI GeneRIF entries. These GeneRIFs provide a summary of information about genes. You can search the GeneRIFs by entering this simple command in the ANY or ALL fields: "RIF=text_string" or "rif=text_string". For example, RIF=schizophrenia will generate a list of all genes with schizophrenia listed anywhere in their GeneRIFs. We encourage all users to enter their own comments and notes in the GeneWiki to supplement and extend the GeneRIFs. (Implemented by Jintao Wang.)
2006-06-05: A new assembly of the mouse genome (mm8) is now being integrated into GeneNetwork databases and the web site. Please note that many data sets still rely on the mm6 assembly. (Implemented by Jintao Wang.)
2006-05-18: We are experimenting on the Beta site with Scalable Vector Graphics (SVG) displays of scatter plots generated from the Correlations Results tables. SVG allows you to modify the display size and the area of graphs using control clicks. You will need an SVG plug-in for your browser and hardware. SVG works fine with most Intel and Macintosh computers. However, if you have a Macintosh with an Intel processor you will find that the SVG version of the GenomeGraph does not work unless you force Safari to open using Rosetta. It is easy to do this. Just follow these directions. (Implemented by Jintao Wang.) Running Safari using Rosetta will slow it down somewhat, so consider this a temporary solution.
2006-05-11: New and final "Eye M430v2 (Apr06) RMA" database has been added to GN beta and production web site. This data set includes data for 71 strains including 55 BXD strains, C57BL/6J, DBA/2J, reciprocal F1 hybrids and 12 other strains of mice. The Info file is still incomplete. Data generated by Weikuan Gu, Eldon Geisert, Yan Jian, Lu Lu, and Rob Williams with support from Barrett Haik and the Hamilton Eye Institute. (Implemented by Hongqiang Li, Yanhua Qu, and Jintao Wang.)
2006-04-26: New and final Mouse BXD Eye mRNA expression database is being added to the Beta site using a new quality control procedure. The data are still being error corrected as of April 29, 2006. This data set includes 57 BXD strains, C57BL/6J, DBA/2J, F1 hybrids and 12 other strains of mice. Data generated by Weikuan Gu, Eldon Geisert, Yan Jian, Lu Lu, and Rob Williams with support from Barrett Haik and the Hamilton Eye Institute. (Implemented by Hongqiang Li, Yanhua Qu, and Jintao Wang.)
2006-04-26: Search menus are now being updated so that they provide a complete list of available databases in hierarchical pull-down menus. (Implemented by Jintao Wang.)
2006-04-18: We have converted our python code to utilize Mod_python. Mod_python is an Apache code module that embeds a Python interpreter within the server and that will often run many times faster than a traditional Common Gateway Interface (CGI). Mod_python will not help much for those processes (e.g., interval mapping or correlation tables) that take a long time to compute. But for fast processes, such as generating AJAX menus, opening data-editing page, it helps substantially. (Implemented by Jintao Wang.)
2006-03-31: Correlation Results tables now include a feature to add multiple columns of correlations. This makes it possible to quickly identify well and poorly conserved correlations across data sets and tissues. You may need to use a newer browser to exploit this new feature. (See News item of March 14th; Implemented by Jintao Wang.)
2006-03-16: NCBI Entrez Gene LinkOut established. LinkOut is a service of NCBI and Entrez that allows you to link directly from PubMed and other Entrez databases to a wide range of information and services beyond the Entrez system. NCBI pages now link from mouse and rat genes to GeneNetwork expression data sets. (Implemented by Hongqiang Li.)
2006-03-14: GENSAT BGEM link to GN established. The Brain Gene Expression Map is a large library of in situ gene expression images of the embryonic, neonatal, and adult mouse. It includes data for over 3000 genes. (Implemented by Tom Curran and the BGEM group at St Jude Children's Research Hospital.)
2006-03-15: Correlation Results tables are now implemented using AJAX code that allows rapid resorting of the top 100, 200, or 500 traits. You will now see small UP and DOWN sort arrows in the column heads. You may need to use a newer browser to exploit this new feature. Being able to resort tables is useful when you would like to filter a list of traits by expression value (usually from high to low) or by position. AJAX is a programming method that makes web pages more responsive and dynamic. (Implemented by Jintao Wang.)
2006-01-20: GeneNetwork's MySQL relational database has been moved to a dual dual-core AMD Opteron 280 computer system assembled by Monarch Computer for improved performance. This system has more than halved the time required to compute correlation tables, from about 100 seconds down to 40 seconds. (Implemented by Jintao Wang.)
2005-12-19: A short Review of GeneNetwork by William R. Lariviere on the American Pain Society web site.
2006-01-03: A GeneWiki system is being implemented. GeneWiki (also known as Gene Notes) allows any user of GN to add notes to the GN database. You can add annotations for genes of interest. All annotation is public. For example, RWW has added annotations on expression patterns of genes in different brain regions using data taken from the Allen Brain Atlas and GENSAT. Our first GeneWiki implementation does not conform to all WIKI standards, and it may be more appropriate to consider GeneWiki as a simple system for adding notes on genes. We hope to load GeneWiki with many of the NCBI GeneRIFs. (Implementation in progress by Jintao Wang.)
2005-12-19: An AJAX implementation of the Search Page is now being tested on the beta site. There should be almost no noticeable difference if you are using a current version of common web browsers (Explorer, Firefox, Safari). Please contact us if you have any problems. (Implemented by Jintao Wang.)
2005-12-15: GeneWiki feature added to GeneNetwork. You can add short annotations that relate to genes to the GN database using an interface we have borrowed from the NCBI Gene Reference into Function (GeneRIF). To read all annotations provided by all users please click on the Annotations button (or GeneRIF button). All annotations are open and public. Annotations should ideally be of use to the research community. Here is an example of a recent annotation entered for the mouse Etv1 gene: "Amygdala and hippocampal CA1 and subiculum expression signature, highly specific neocortical layer 5 expression signature, cerebellar granule cell expression signature (data from Allen Brain Atlas, ABA)."
When adding a note, if possible please provide a PubMed ID number or a web address (URL). You can use the Annotations feature to find groups of genes that belong to interesting functional categories. We are currently using this feature to define sets of "expression signatures" for different parts of the mouse brain, for example genes and transcripts with highly selective expression in the dentate gyrus of the hippocampal formation. (Implemented by Rob Williams and Jintao Wang.)
2005-12-02: New Advanced Search function now allows users to search for either cis-acting or trans-acting QTLs across entire expression data sets. The general format is "TransLRS=(Low_LRS_limit, High_LRS_limit, Mb_buffer)". This syntax can be combined in the ALL field with other conditions, such as the chromosome location of the QTL and the expression level of the trait. For a better explanation please see the Advanced Search page. (Implemented by Jintao Wang.)
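The cis/trans distinction behind this search can be sketched in a few lines of Python. This is an illustration only, not the GeneNetwork implementation: the function names, the record fields, and the rule that a QTL counts as trans when it lies on a different chromosome or more than Mb_buffer away from the gene are all assumptions made for the example.

```python
def classify_qtl(gene_chr, gene_mb, qtl_chr, qtl_mb, mb_buffer):
    """Label a QTL 'cis' if its peak lies on the gene's chromosome within
    mb_buffer megabases of the gene, and 'trans' otherwise (assumed rule)."""
    if gene_chr == qtl_chr and abs(gene_mb - qtl_mb) <= mb_buffer:
        return "cis"
    return "trans"


def trans_lrs_filter(records, low_lrs, high_lrs, mb_buffer):
    """Keep records whose peak LRS is within [low_lrs, high_lrs] and whose
    QTL is classified as trans. Record keys are hypothetical."""
    return [
        r for r in records
        if low_lrs <= r["lrs"] <= high_lrs
        and classify_qtl(r["gene_chr"], r["gene_mb"],
                         r["qtl_chr"], r["qtl_mb"], mb_buffer) == "trans"
    ]


example_records = [
    {"name": "t1", "lrs": 25.0, "gene_chr": "1", "gene_mb": 50.0, "qtl_chr": "1", "qtl_mb": 52.0},
    {"name": "t2", "lrs": 30.0, "gene_chr": "1", "gene_mb": 50.0, "qtl_chr": "7", "qtl_mb": 52.0},
    {"name": "t3", "lrs": 8.0, "gene_chr": "2", "gene_mb": 10.0, "qtl_chr": "9", "qtl_mb": 90.0},
]
trans_hits = trans_lrs_filter(example_records, 15, 1000, 5)
```

In this toy data only t2 passes: t1 is within 5 Mb of its gene (cis) and t3 falls below the lower LRS limit.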
2005-11-21: Demonstration XML Schema for mouse data sets has been published for the use of the Biomedical Informatics Research Network (BIRN). For readability, please review the source code version of this page. This is an initial demonstration/proof-of-principle. (Implemented by Hongqiang Li.)
2005-11-15: Basic Statistics pages have been improved to handle larger data sets and to provide better graphic output. (Implemented by Jintao Wang and Rob Williams)
2005-11-14: Literature Correlations gene data set by Ramin Homayouni, Michael Berry and colleagues has been updated. The literature correlations are positive values between 0 and 1 that summarize the pair-wise similarity of genes (or transcripts) on the basis of the known literature using the methods described on the Semantic Gene Organizer site. (Implemented by Ramin Homayouni, Lai Wei, Kevin Heinrich, and Jintao Wang.)
2005-11-01: New Affymetrix M430v2 Eye Data Set for 63 strains of mice (C57BL/6J, DBA/2J, their reciprocal F1 hybrids, 47 BXD recombinant inbred strains, and 12 diverse inbred strains) has been entered on the beta site by the UTHSC Hamilton Eye Institute. Expression data for whole eye are available from Species = Mouse, Group = BXD, and Type = Eye. The Information (INFO) file that accompanies this M430 data set is still provisional. Use of these data in publications is currently limited to members of the HEIMED consortium pending addition of more data, publication, and formal release, but if you would like permission to make selected use of data please contact Robert W. Williams, UTHSC. (Implemented by Lu Lu, Yan Jiao, Yanhua Qu with support of the Hamilton Eye Institute.)
2005-10-24: New Affymetrix M430v2 Hippocampus Data Set for 96 strains of mice (65 BXD, 13 CXB, and 16 diverse inbred strains, B6D2F1 and D2B6F1) will be placed on the beta site by the Hippocampus Array Consortium at the end of October. Expression data for whole hippocampus will be available from Species = Mouse, Group = BXD, and Type = Hippocampus. The Information (INFO) file that accompanies this M430 data set is still provisional. Use of these data in publications is currently limited to members of the consortium pending data addition, publication, and formal release, but if you would like permission to make selected use of data please contact Robert W. Williams, UTHSC. (Implemented by Lu Lu, Shirlean Goodwin, Yanhua Qu, Rob Williams, and members of the Hippocampus Consortium.)
2005-10-10: New Affymetrix M430v2 Striatum Data Set for a B6D2F2 Intercross has been placed on the beta test site by Robert Hitzemann and colleagues. Expression data for the striatum of 30 males and 30 females are available from Species = Mouse and Group = BDF2-2005. The Information (INFO) file that accompanies the M430 data is still provisional. For use of these unpublished data please contact Robert Hitzemann, Department of Behavioral Neuroscience, Oregon Health & Science University. (Implemented by Yanhua Qu.)
2005-10-07: Advanced Search options have been improved. The main improvement involves combining Gene Ontology searches with other advanced search syntax. (Implemented by Hongqiang Li.)
2005-09-28: GeneNetwork Mouse SNP Browser has been upgraded with Perlegen/NIEHS data. The SNP Browser is a tool that is used in combination with the Interval Analyst to evaluate and rank genes and polymorphisms in intervals thought to be responsible for variation in traits. The SNP Browser includes all Celera Genomics mouse SNPs, all public mouse SNPs in dbSNP (as of August 2005), and all Perlegen-NIEHS SNPs (http://mouse.perlegen.com/mouse/download.html as of late Sept 26, 2005). We thank Paul Thomas and Richard Mural of Celera Genomics, Gary Churchill and Natalie Blade of the Jackson Laboratory, and the Perlegen/NIEHS sequencing consortium for help and access to data. (Implemented by Robert Crowell, Alex Williams, and Jintao Wang.)
An example: To search for SNPs type in this string and then modify position as desired:
2005-09-27: Gene Ontology searching is now possible. This search feature allows you to search for all genes/transcripts related to particular categories using the appropriate GO identifier. For example, to extract all transcripts associated with "synaptic vesicle exocytosis" enter the string "GO:0016079" in the ANY field. To browse GO terms and classes link to AmiGo. As of Sept 2005, the GO contains approximately 20,000 terms of which approximately 6300 GO terms can be associated with genes in one or more of the GeneNetwork databases. Approximately 700 high level GO terms will return well over 200 genes. Given the 500 transcript limit it is therefore useful to select lower level GO terms that will return 100 or fewer probe sets/transcripts/genes. (Implemented by Hongqiang Li.)
2005-09-20: The UCSC Gene Browser is now linked to GeneNetwork from the Gene Description and Page Index as a "Quick Link" for both mouse and rat genomes. (Implemented by Jintao Wang at UT and Fan Hsu at UCSC.)
2005-09-06: Phenotype Data Entry SOP. We are beginning to develop standard operating procedures (SOP) to allow colleagues to deposit new data sets into the GeneNetwork. Please review this initial Phenotype data entry SOP if you have traits that you would like added to either an existing or new mapping panel (Partially implemented by Rob Williams.)
2005-08-26: OHSU/VA B6D2F2 Brain mRNA 430 (Aug05) MAS5, RMA and PDNN array data sets now are available. These data sets include M430 Set A and Set B arrays (Implemented by Yanhua Qu.)
2005-08-19: GenomeGraph has been implemented for several large array data sets and can now be used for testing purposes. GenomeGraph is a new module of The GeneNetwork that is designed for the analysis of entire array data sets. (Implemented by Jintao Wang.)
2005-08-17: Dynamic GeneNetwork Database Schema Description allows database experts to review the data structure and fields used by the GeneNetwork MySQL relational database. We have just begun the textual annotation of the database tables and fields. This new system will soon replace the current database "dump" available at http://www.genenetwork.org/schema.html (Implemented by Hongqiang Li.)
2005-08-16: Traits in the Selections Windows Now Sortable. The Selection command is used to move trait data from one or more databases into a single Selections window (aka the "shopping cart") for common analysis. For example, users can put classical phenotypes such as body and brain weight in the same Selections window with transcripts for growth hormone receptor (Ghr), GH releasing hormone (Ghrh), and GHRH receptor (Ghrhr) in liver and brain. The new feature makes it possible to sort items in the Selections window by database, position, or name. Sorting is helpful in reviewing contents of the window and in reordering items prior to calculating correlation matrices. Please recall that all items in a Selections window must come from a single genetic reference population or panel, for example the AXB/BXA strains of mice, the BXH strains of rat, or from one of several intercrosses. (Implemented by Jintao Wang.)
2005-08-12: New Mouse Liver and Metabolic Trait Databases have been released by Dr. Alan Attie and colleagues. While these data may be reviewed, their use is still reserved until final publication. The primary database is an Affymetrix M430 survey of gene expression in the liver of 60 selected F2 mice (a B6 x BTBR F2-ob/ob cross) that includes data on approximately 45,000 probe sets. This array database is accompanied by 24 classical metabolic and blood chemistry traits. All F2 animals were genotyped at 194 microsatellite markers. (Implemented by Alan Attie and colleagues, Yanhua Qu, and Jintao Wang.)
2005-08-08: The Interval Analyst (IA) provides a tabular summary of known genes in a chromosomal interval with data on gene expression, gene size, SNP number and density, and human homologs. The IA is still a beta site function but will be released to the public site in the next week. The IA table is automatically generated with each chromosome map. IA tables can be extensively customized and resorted. For the BXD and AXB/BXA mouse genetic reference panels, the IA also provides access to Celera SNPs, as well as public SNPs from a variety of sources. Clicking on the SNP number for a specific gene in the IA generates a SNP browser table (at present, only for mouse). The purpose of the IA is to allow users to rank-order genes in an interval that may be contributing to variability in phenotypes. (Implemented by Evan Williams, Robert Crowell, Alex Williams, and Rob Williams.)
2005-08-08: The design of Chromosome and Whole Genome QTL Maps has been significantly improved and updated. These new physical QTL maps merge LRS or LOD functions with gene and SNP tracks and can be zoomed to the level of single genes and SNPs. Maps can be exported in 2X versions that are near publication quality. Below most maps you will now find an Interval Analyst table that can be customized to help rank-order candidate genes. Variants of these new maps have been introduced to handle all species and genetic reference populations. (Implemented by Robert Crowell, Alex Williams, Evan Williams, and Rob Williams; final integration by Jintao Wang.)
Legend: Sample of a new high resolution physical map. This map shows a locus that modulates the expression of the Cart transcript (cocaine and amphetamine regulated transcript) on distal Chr 10 in BXD mouse strains (brain tissue). The Control Block, top middle, permits users to customize the display and its resolution. Pink, blue, and beige horizontal bars above the map provide links to higher resolution maps (8x) or to the UCSC and ENSEMBL genome browsers. Statistical thresholds for linkage are marked by grey and pink horizontal lines and are based on 2,000 permutations. The Y-axis provides a scale for the LRS or LOD scores, which are plotted using a thicker blue line. The calculation of linkage statistics is based on a total of 147 useful markers that have been genotyped in all 89 BXD strains (The Wellcome-CTC Mouse Strain SNP database with added microsatellite markers). The LRS function looks far more digital than traditional interval maps for the simple reason that locations of recombinations in this cross have been precisely defined and only a few regions exploit a true interval mapping approach (see News item of 2005-06-17 for additional detail).
The thinner red and green lines and the right Y-axis display the additive effect size; green for high alleles inherited from one parent (DBA/2J in this example), and red for high alleles from the other parent (C57BL/6J). The units are log2 expression differences where 0.2 is equivalent to a 2^0.2-fold difference. The large number of closely packed tick marks along the top of the map show locations of genes on Chr 10. Gene blocks are color coded by the average density of SNPs per gene using a rainbow color sequence with low density in the blue/green spectrum and high density in orange/red spectrum. The bright orange hash marks along the X-axis provide a graphic estimate of numbers of SNPs that are segregating in the BXD strains in any particular chromosomal region. A long interval from 30 Mb to 65 Mb is almost identical by descent between the two parental strains.
Many regions of these maps are responsive to a mouse click. For example, the name and size of any gene can be determined by simply placing the mouse cursor over its mark. The same applies to the significance thresholds and the SNP track. Below each of these maps is a complete list of known genes in the interval with numerous links to other data types, including information on expression, lists of known SNPs in each gene, and corresponding regions of the human genome. All physical map positions in mouse are based on the Mouse Build 34, mm6 (March 2005).
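The additive effect units described in the legend above are log2 expression differences, so converting one to a fold difference is a single exponentiation. The short Python sketch below (the function name is ours, chosen for illustration) shows that an effect of 0.2 corresponds to roughly a 1.15-fold difference, and an effect of 1.0 to a 2-fold difference.

```python
def fold_difference(log2_difference):
    """Convert a difference on the log2 expression scale to a fold difference."""
    return 2.0 ** log2_difference


effect_02 = fold_difference(0.2)   # about 1.15-fold
effect_10 = fold_difference(1.0)   # exactly 2-fold
```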
2005-08-02: An Export Traits function button has been added to the set of tools available in each Selections Window (the Selections window is known informally as the "shopping cart"). Export Traits now joins other tools such as Cluster Tree, Network Graph, and Compare Correlates at both the top and bottom of each Selection window. Any set of traits in the Selection window can be easily exported, including conventional phenotypes, genotypes, and subsets of array data. The default output format is compatible with Microsoft Excel. (Implemented by Jintao Wang.)
2005-07-29: Rat RAE 230A and Mouse (M430 and U74A) Affymetrix Probe Set Annotation Tables have been significantly improved and realigned to rat and mouse genome assemblies. Information taken from the BLAT alignment data has been added to GeneNetwork data tables. Data types include the alignment score of concatenated probes, probe set specificity (usually the ratio of first hit score divided by second hit score), and position values of the 3' and 5' ends of the concatenated probe sequences. [Implemented by Senhua Yu (rat) and Yanhua Qu (mouse).]
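As a rough illustration of the specificity measure mentioned above (best alignment score divided by second-best score), here is a minimal Python sketch. The function name and the choice to return None when there is no usable second hit are assumptions for the example; the entry does not say how that case is handled in the annotation tables.

```python
def probe_set_specificity(hit_scores):
    """Ratio of the best alignment score to the second-best score.

    Returns None when there is no second hit (or its score is zero),
    because the ratio is undefined in that case (assumed convention).
    """
    ranked = sorted(hit_scores, reverse=True)
    if len(ranked) < 2 or ranked[1] == 0:
        return None
    return ranked[0] / ranked[1]


specificity = probe_set_specificity([250, 50, 10])
```

A high ratio suggests that the concatenated probe sequence aligns much better to one genomic location than to any other.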
2005-07-27: All Mouse Genotype Databases have now been fully updated using Wellcome-Illumina-CTC SNP data sets consisting of 13377 SNPs. These SNPs have been integrated with the older microsatellite markers used through July 2005. You can search for markers (see Advanced Search) and treat genotypes as a standard "trait." You can also align the sequence of any marker to the latest genome assembly to determine where a SNP or microsatellite is located. (Implemented by Jing Gu, Lu Lu, and Jintao Wang.)
2005-07-26: Complete Upgrade of the PUBLISHED PHENOTYPE Databases. All PubMed abstracts were searched in June and July of 2005 for publications pertaining to BXD, AXB, CXB, or BXH mouse recombinant inbred strains. Means and standard errors were collected, reviewed, and extracted from these papers. Data were then entered manually in GeneNetwork tables by Emily English and Elissa Chesler.
2005-07-26: Sorting Traits by several different variables is now possible in the Search Results page. Select from seven different ways to sort lists as shown in the screen shot below. (Implemented by Jintao Wang.)
2005-07-26: QTL Reaper tutorial has been added to the GeneNetwork site. QTL Reaper is a command line program for high throughput mapping of array data sets. (Implemented by Evan Williams.)
2005-07-25: An Error Detected and Corrected in SJUT Cerebellum databases dated March 2005. Data for BXD23 mistakenly included a BXD14 sample. All three March 2005 databases (RMA, PDNN, MAS5) have now been corrected. Values for the two affected strains are changed relative to data in this database prior to July 25, 2005. (Implemented by Jing Gu, Rob Williams, and Yanhua Qu.)
2005-07-22: Modified Linux Virtual Server configuration to eliminate problems with client institution firewall restrictions on numbers of simultaneous connections. Our thanks to Dr. Michael Miles for his help diagnosing firewall problems for clients. (Implemented by Jintao Wang.)
2005-07-21: Improved Advanced Search. It is now possible to combine search strings to generate complex queries. For example, this combination Mb=(Chr11 90 100) Mean=(12 20) when entered into the lower ALL field will find transcripts that map to Chr 11 between 90 and 100 Mb that also have mean expression between 12 and 20 units. (Implemented by Jintao Wang.)
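To make the combined query concrete, the following Python sketch parses a string such as Mb=(Chr11 90 100) Mean=(12 20) and applies both conditions to a list of records. It is only a toy model of the idea: the regular expression, the record keys (chr, mb, mean), and the inclusive bounds are assumptions, and the real Advanced Search supports more terms than the two handled here.

```python
import re

# One search term looks like Key=(value value ...); commas are tolerated.
TERM = re.compile(r"(\w+)=\(([^)]*)\)")


def parse_query(query):
    """Split a combined search string into {key: [values]} (keys lowercased)."""
    return {key.lower(): body.replace(",", " ").split()
            for key, body in TERM.findall(query)}


def matches(record, terms):
    """True if a record satisfies every term present (all terms are ANDed)."""
    if "mb" in terms:
        chrom, low, high = terms["mb"]
        if record["chr"].lower() != chrom.lower():
            return False
        if not float(low) <= record["mb"] <= float(high):
            return False
    if "mean" in terms:
        low, high = terms["mean"]
        if not float(low) <= record["mean"] <= float(high):
            return False
    return True


terms = parse_query("Mb=(Chr11 90 100) Mean=(12 20)")
records = [
    {"name": "a", "chr": "Chr11", "mb": 95.2, "mean": 13.1},
    {"name": "b", "chr": "Chr11", "mb": 95.2, "mean": 8.0},
    {"name": "c", "chr": "Chr2", "mb": 95.2, "mean": 13.1},
]
hits = [r["name"] for r in records if matches(r, terms)]
```

Only record a satisfies both the position and the mean expression condition.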
2005-07-15: GeneNetwork Mouse SNP Browser has been implemented. The SNP Browser is a tool that will eventually be used in combination with the Interval Analyst to evaluate and rank genes and polymorphisms in intervals thought to be responsible for variation in traits. The SNP Browser includes all Celera Genomics mouse SNPs, all public mouse SNPs in dbSNP (as of August 2005), and all Perlegen-NIEHS SNPs (http://mouse.perlegen.com/mouse/download.html as of late June 2005). The SNP Browser is still at an early stage of development. We thank Paul Thomas and Richard Mural of Celera Genomics, Gary Churchill and Natalie Blade of the Jackson Laboratory, and the Perlegen/NIEHS sequencing consortium for help and access to data. (Implemented by Robert Crowell, Alex Williams, and Jintao Wang.)
An example: To search for SNPs on Chr 5 from X to Y Mb:
2005-07-15: Access to GeneNetwork Archive site. The archive site provides access to old data sets and old genotype files that have now been superseded. We anticipate that it will be used mostly to verify old findings and to document changes in results. The Archive is now available from the main search page. (Implemented by Jintao Wang.)
2005-07-13: LXS Genotypes Upgraded. Genotypes for the large GRP of LXS strains have been greatly improved thanks to the Illumina-Wellcome-CTC SNP project. The original set of 330 markers has been replaced with a set of 2659 informative markers. Download either the LXS genotypes or BXD genotypes used by WebQTL as text files.
(Implemented by Jing Gu and Jintao Wang.)
2005-07-12: Search Page Upgraded. Users now can change the default settings to those they most commonly use. Your browser must be configured to allow The GeneNetwork to retain a "cookie" on your computer. We have also added a new button labeled ADVANCED SEARCH that provides advice and syntax for searches. (Implemented by Jintao Wang.)
2005-07-12: Pair-Scan Upgraded. The pair scan now exploits the new Wellcome-Illumina high density genotype files. This results in more exhaustive searches for two-locus interactions. This is particularly true when single chromosome pairs are scanned by clicking on the initial DIRECT output graph. (Implemented by Jintao Wang.)
2005-07-12: Updated Affymetrix M430 GeneChip Annotation Data. We have realigned all M430 probes and probe set sequences onto the latest mouse assembly (Build 34 or mm6). This annotation is more complete than most other available M430 probe set annotation of which we are aware, including Affymetrix NetFX. (Implemented by Yanhua Qu.)
2005-06-17: New High Density Mapping Algorithm that exploits the Wellcome-CTC SNP data has been implemented for the BXD mouse genetic reference population on both public and beta sites. In the case of the BXD panel (BXD1 through BXD100), the merged SNP and microsatellite maps are based on a total of 7636 informative markers that differ between the parental strains, C57BL/6J (B) and DBA/2J (D). The locations of these markers are known on the latest assembly of the mouse genome (Build 34, mm6). The median distance between markers in this subset is 178,831 bp. The mean distance is 324,493 bp. There are only 26 intervals between markers that are longer than 5 Mb. No interval is greater than 10 Mb except on Chr X. These long intervals are essentially monomorphic between the parental strains.
The new algorithm exploits a selected subset of 3795 markers that includes all markers with unique strain distribution patterns (SDP), as well as pairs of markers (the most proximal and most distal markers) for SDPs represented by two or more markers. This BXD genotype data set can be downloaded by ftp at ftp://atlas.utmem.edu/public/BXD_WebQTL_Genotypes_June05.txt.
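The marker selection rule described above can be sketched as follows. This Python example is a simplified reading of the rule, not the code used to build the genotype file: it assumes markers on one chromosome arrive in physical order as (name, SDP) pairs, treats each run of adjacent markers sharing an SDP as one group, and keeps the most proximal and most distal marker of each run. The function name and data layout are hypothetical.

```python
from itertools import groupby


def select_mapping_markers(ordered_markers):
    """Keep every marker with a unique SDP, and only the first and last
    marker of each run of adjacent markers that share the same SDP.

    ordered_markers: list of (marker_name, sdp_string) in physical order.
    """
    selected = []
    for _, run in groupby(ordered_markers, key=lambda marker: marker[1]):
        run = list(run)
        selected.append(run[0])          # most proximal marker of the run
        if len(run) > 1:
            selected.append(run[-1])     # most distal marker of the run
    return selected


markers = [
    ("m1", "BBD"), ("m2", "BBD"), ("m3", "BBD"),   # three markers, one SDP
    ("m4", "BDD"),                                 # unique SDP
    ("m5", "DDD"), ("m6", "DDD"),                  # two markers, one SDP
]
kept = [name for name, _ in select_mapping_markers(markers)]
```

Here m2 is dropped because it carries no recombination information beyond what m1 and m3 already define.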
The mapping algorithm is a mixture of simple marker regression, linear interpolation, and standard Haley-Knott interval mapping. If two adjacent markers have identical SDPs they will have identical linkage statistics, as will the entire interval between the markers (assuming complete and error-free haplotype data for all strains). On a physical map the LRS and the additive effect values will therefore be constant over this interval. Between neighboring markers that are separated by 1 cM or more we use a conventional interval mapping method (Haley-Knott) combined with a Haldane estimate of genetic distance. When the interval is less than 1 cM we simply interpolate linearly based on a physical scale between the markers. The result of this mixture mapping algorithm is a map of the trait that has an unusual profile that is particularly striking on a physical (Mb) scale, with many plateaus, abrupt linear transitions between plateaus, and a few regions with the standard graceful curves typical of interval maps.
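The decision logic in the paragraph above can be outlined in Python. This is a sketch under stated assumptions rather than the WebQTL code: the function names are ours, the Haley-Knott regression itself is not implemented (only the choice of method is shown), and the Haldane map function is the standard textbook form r = (1 - exp(-2d)) / 2 with d in Morgans.

```python
import math


def haldane_recombination_fraction(distance_cm):
    """Haldane map function: recombination fraction for a distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def choose_method(left_sdp, right_sdp, distance_cm):
    """Pick how to fill the interval between two adjacent markers."""
    if left_sdp == right_sdp:
        return "plateau"                    # identical statistics across the interval
    if distance_cm < 1.0:
        return "linear interpolation"       # short interval, physical (Mb) scale
    return "Haley-Knott interval mapping"   # 1 cM or more


def interpolate_lrs(left_mb, left_lrs, right_mb, right_lrs, position_mb):
    """Linear interpolation of the LRS at a physical position between markers."""
    fraction = (position_mb - left_mb) / (right_mb - left_mb)
    return left_lrs + fraction * (right_lrs - left_lrs)


lrs_quarter_way = interpolate_lrs(10.0, 4.0, 11.0, 8.0, 10.25)   # 5.0
r_one_cm = haldane_recombination_fraction(1.0)                   # just under 0.01
```

The plateau and linear cases explain the flat segments and abrupt straight transitions seen on the physical maps; only the third case produces the smooth curves of conventional interval mapping.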
The same procedure will soon be implemented for other mouse GRPs, including AXB/BXA, CXB, BXH, and AKXD.
For users who would like reference access to the old set of genotypes, we will set up an Archive site with the May 2005 microsatellite markers and maps.
To download the combined SNP and microsatellite genotype file used in WebQTL please link to ftp://atlas.utmem.edu/public/ and look for Illumina_UT_BXD_May05.xls (entire data set) or BXD_WebQTL_Genotypes_June05.txt (extracted subset of markers used by WebQTL), or link to Dr. Richard Mott's Mouse Inbred Line Genotype site for the original SNP data set. (Implemented by RW Williams, KF Manly, and JT Wang.)
2005-06-13: Rat HXB Fat Data Set released on the www.genenetwork.org/search3.html test site (stabilized RMA transform). The Affymetrix RAE230A data files generated by Tim Aitman and colleagues were downloaded from the Array Express site. The set of 120+ arrays covers a total of 30 RI strains and complements a recent paper (Hübner et al., 2005). Error checking is still in progress and this is a pre-release data set to use for test purposes. (Implemented by Senhua Yu, R. Williams, and Jintao Wang. More transforms are in progress.)
2005-06-12: Moved GeneNetwork and Upgraded Utilities. The GeneNetwork and the WebQTL module have been moved to a cluster of nine P4 single processor computers. Eight of the nodes are devoted to the GeneNetwork application code while the ninth node runs the Linux virtual server. The MySQL database server currently runs on a separate Proliant dual processor node. The Roundup issue tracking system has been upgraded to v. 0.83 and is now available at http://www.genenetwork.org:8080/webqtl/. Analog has also been upgraded to v 6.0. (Implemented by Jintao Wang, with thanks again to Ari Berman.)
2005-05-24: Ultra-high Resolution Mouse SNP Genetic Maps are now gradually replacing the previous generation of microsatellite maps. Until May 2005, all genetic maps of recombinant inbred strains of mice in WebQTL have relied heavily on a set of roughly 1500 microsatellite markers genotyped across all RI sets by the Informatics Center for Mouse Neurogenetics (Williams et al., 2001; Peirce, Lu et al, 2004). In collaboration with members of the CTC (Richard Mott, Jonathan Flint and colleagues), we have helped genotype a total of 480 strains using a panel of 13,377 SNPs. More than half of the SNPs are informative in most crosses. These SNPs have been combined with microsatellites to produce new consensus maps for BXD and other GRPs using the latest mouse genome assembly as a reference frame (Build 34 - mm6). In the case of the BXD GRP, a total of 88 strains were genotyped using the full set of SNPs of which 7482 are informative. The order of markers given in WebQTL is essentially the same as that given in Build 34. To reduce false positive errors when mapping using this ultradense map, we have eliminated most single genotypes that generate double-recombinant haplotypes. Double-recombinant haplotypes are most commonly produced by typing errors ("smoothed" genotypes). (Implemented by Lu Lu, Jing Gu, Jintao Wang, Ken Manly, and Rob Williams, with help from Jonathan Flint and Richard Mott).
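One way to locate single genotypes that generate double-recombinant haplotypes is sketched below. It is a simplified illustration, not the procedure used to build the WebQTL maps: it assumes one strain's calls are given in marker order along a chromosome, ignores missing calls and marker spacing, and uses a hypothetical function name.

```python
def flag_single_double_recombinants(calls):
    """Return indices of calls that differ from two agreeing neighbors.

    calls: sequence of genotype codes (for example "B" or "D") for one
    strain, ordered by marker position on one chromosome.
    """
    flagged = []
    for i in range(1, len(calls) - 1):
        left, middle, right = calls[i - 1], calls[i], calls[i + 1]
        # A lone call flanked by the opposite genotype on both sides
        # implies two recombinations around a single marker.
        if left == right and middle != left:
            flagged.append(i)
    return flagged


suspect = flag_single_double_recombinants(list("BBDBBDDD"))
```

In the example only the isolated D at index 2 is flagged; the later B-to-D transition is an ordinary single recombination and is left alone.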
2005-05-23: Search Functions have been upgraded. It is now possible to (1) find all transcripts whose genes map to a given chromosomal location; (2) find all traits and transcripts that have a mean value within a particular range; (3) find all traits that have a peak genome-wide linkage score (LRS score or p value) within a particular range. These new search functions are still being tested on the test site (http://www.genenetwork.org/search3.html). (Implemented by Jintao Wang).
(1) To find transcripts by chromosomal position the search syntax needs to follow these rules:
- "Position in (ChrY 0.3 52.4)" or "Position = (Chr1, 98 104)" [Note: No space between "Chr" and the number or letter of the chromosome. ]
- "Pos in (ChrY 0.3 52.4)" or "Pos =(Chr1, 98 104)" [don't enter the quotes.]
- "Mb in (ChrY 0.3 52.4)" or "Mb = (Chr1, 98 104)" [don't enter the quotes.]
(2) To find traits by mean value, the search syntax needs to follow these rules:
- "Mean in (12.3, 12.4)" or "Mean=(12.3, 12.4)" [These strings will find those traits with a mean value from 12.3 to 12.4. Don't enter the quotes.]
(3) To find traits by LRS value or p value, the search syntax needs to follow these rules:
- "LRS in (20, 30)" or "LRS=(20, 30)" [These strings will find traits with LRS values ranging from 20 to 30. This search depends on the existence of a database of precomputed LRS values. If this database has not yet been set up for a particular data set, then the search will not return any records. Don't enter the quotes.]
- "pvalue in (0.0001, 0.001)" or "pvalue=(0.0001, 0.001)" [These strings will find traits with p values ranging from 0.0001 to 0.001. This search depends on a database of precomputed values. If this database has not yet been set up for a particular data set, then the search will not return any records. Don't enter the quotes.]
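The search strings listed above share one shape: a field name, then "in" or "=", then a parenthesized range that may start with a chromosome. The sketch below shows how such strings could be parsed. It is an assumption-laden illustration, not the GeneNetwork parser: the regular expression, the function name parse_range_query, and the returned tuple layout are all hypothetical.

```python
import re

# field name, then "=" or the word "in", then "( ... )"
_QUERY = re.compile(
    r"^\s*(?P<field>[A-Za-z]+)(?:\s*=|\s+in\b)\s*\((?P<args>[^)]*)\)\s*$",
    re.IGNORECASE,
)


def parse_range_query(query):
    """Return (field, chromosome_or_None, low, high) for a range search."""
    match = _QUERY.match(query)
    if match is None:
        raise ValueError("not a range query: %r" % query)
    field = match.group("field").lower()
    # Arguments may be separated by commas, spaces, or both.
    tokens = [t for t in re.split(r"[,\s]+", match.group("args").strip()) if t]
    chromosome = None
    if tokens and tokens[0].lower().startswith("chr"):
        chromosome = tokens.pop(0)   # e.g. "Chr1" or "ChrY", no inner space
    if len(tokens) != 2:
        raise ValueError("expected a low and a high value: %r" % query)
    return field, chromosome, float(tokens[0]), float(tokens[1])
```

With this sketch, "Pos =(Chr1, 98 104)" and "Position = (Chr1, 98 104)" differ only in the field name that comes back, so a caller could map "position", "pos", and "mb" to the same search.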
2005-05-13: Virtual Server implementation of The GeneNetwork is being beta tested. The Linux Virtual Server (LVS) allows GeneNetwork to exploit a small cluster of servers to handle larger numbers of clients quickly. Performance is particularly critical during bioinformatics class projects when large numbers of students make nearly simultaneous requests. (Implemented by Jintao Wang, Senhua Yu, and Ari Berman).
2005-05-12: Genome Explorations Inc. has been provided a license to run a copy of the GeneNetwork and WebQTL software as part of a Phase I Small Business Innovation Research (SBIR) grant from NIAAA. The TCP/IP address is 22.214.171.124. The site currently contains three data sets (MAS5, RMA, and PDNN) generated at GE and UTHSC (subcontractor) using a total of 85 Affymetrix M430 2.0 arrays. The first data release consists of 26 BXD strains, the two parental strains, C57BL/6J and DBA/2J, and ten other inbred strains of mice (A/J, 129S1/SvJ, AKR/J, BALB/cJ, BALB/cByJ, C3H/HeJ, CAST/Ei, KK/HIJ, LG/J, and NOD/J). (Implemented by Jintao Wang, Yanhua Qu, Lu Lu, Robert Williams, Robert Rooney, and Divyen Patel).
2005-05-10: Whole Transcriptome Mapping Display: We are testing an interface that displays an entire transcriptome QTL map for a tissue similar to figures 3A and 3B of Chesler and colleagues (2005). Note that one parameter can be used to modify the false discovery rate of the points that are plotted. Plots have been precomputed for more than 30 databases and transforms. (Implemented by Jintao Wang).
2005-05-04: New Mouse Genome Assembly (NCBI Build 34, UCSC mm6) released by NCBI (implemented by Deanna Church and colleagues). Over the next several months all mouse genome megabase and nucleotide position data and links in the GeneNetwork (markers, probes, SNPs, genes) will be converted to this new assembly. BLAT searches initiated with WebQTL already exploit the most recent build. GeneNetwork users may find small discrepancies in gene and marker locations until all database tables are updated.
2005-04-22: Arabidopsis Data Sets released on the www.genenetwork.org/search3.html test site. The Genotypes and Phenotypes files for the Bay-0 x Shahdara cross data were all provided by Olivier Loudet. Please see the Information file. Implemented by O. Loudet, R. Williams, and Jintao Wang.
2005-04-21: Rat HXB Kidney Data Set released on the www.genenetwork.org/search3.html test site (original RMA transforms). The Affymetrix RAE230A data files were provided by Norbert Hübner and colleagues. The set of 120+ arrays covers a total of 30 RI strains and complements a recent paper (Hübner et al., 2005). Implemented by Senhua Yu, R. Williams, and Jintao Wang. More transforms are in progress (MAS5 added May 13, 2005).
2005-04-14: New S-Score Transform for the BXD Brain data set released on the www.genenetwork.org/search3.html test site. This data set complements existing MAS5, PDNN, RMA, dCHIP, and HWTIPM transforms. The Significance score method centers the expression of every probe set at 0. The signal values are therefore the strain deviations in Z score units from the grand mean based on 100 arrays. The S-score software is described in Zhang et al. (2002) and Kerns et al. (2003).
2005-04-08: Expanded HBP/Rosen Striatum Data Sets released on the www.genenetwork.org/search3.html test site (MAS5, RMA, and PDNN transforms). The new data set covers a total of 33 strains using 59 M430 2.0 arrays. A good demonstration of the improved performance of the expanded data set is Kcnj9 (probe set 1450712_at_A), a known cis-QTL in multiple data sets. This trait generates a peak LRS score of 27.0 in the initial November 2004 data set (MAS5) and a peak LRS of 47.8 in the April 2005 data set (MAS5). The peak LRS is approximately 600 Kb proximal to the Kcnj9 gene. The Heritability Weight Transform (HWT) data set will be added in the next several weeks.
2005-04-04: Expanded INIA Brain Data Sets released on the www.genenetwork.org/search3.html test site (MAS5, RMA, and PDNN transforms). Seventy-one new samples have been added, bringing the total to 105 arrays covering 42 BXD strains, both parents, and the F1 hybrid. A good demonstration of the improved performance of the expanded data set is Kcnj9 (probe set 1450712_at_A), a known cis-QTL in multiple data sets. This trait generates a peak LRS score of 14 in the initial October 2004 data set (MAS5) and a peak LRS of 41.9 in the April 2005 data set (MAS5). The peak LRS is approximately 2000 Kb distal to the Kcnj9 gene. We have also tested these data using probe set 1418908_at_A (Pam). This trait generates a peak LRS score of 52.8 in the initial October 2004 data set (MAS5) and a peak LRS of 54.2 in the April 2005 data set (MAS5). The peak LRS is approximately 800 Kb distal to the Pam gene. The Heritability Weight Transform (HWT) data set will be added in the next several weeks.
2005-03-21: Expanded Cerebellum Data Sets released on the www.genenetwork.org/search3.html test site (MAS5, RMA, and PDNN transforms). Fifty-four new samples have been added. We have tested these data using probe set 1418908_at_A (Pam), a known cis-QTL in multiple data sets. This trait generates a peak LRS score of 31.7 in the initial March 2003 data (MAS5), a peak LRS score of 32.3 in the October 2004 data (MAS5), and a peak LRS of 52.2 in the March 2005 data (MAS5). In the March 2005 data, the peak LRS is only 500 Kb from the 5' promoter region of the Pam gene. The abundantly expressed GABA alpha 6 receptor (Gabra6) transcript (1417121_at_A) is another good test case of a cis-modulated trait in cerebellum. (Implemented by the GeneNetwork group and the Cerebellum Consortium). The Heritability Weight Transform (HWT) data set will be added in the next several weeks.
2005-03-15: Cluster Trees now compute and display up to 100 traits simultaneously. This makes it possible to select the top 100 covariates of a trait from a Correlation Results table and map all 100 as a hierarchically organized group. (Implementation by Jintao Wang).
2005-03-04: Literature Correlation data set has been integrated into GeneNetwork Correlation Results output tables. This important new feature provides an estimate of the strength of relations between pairs of genes that is based on a textual analysis of PubMed abstracts (latent semantic index correlations). Values are based on a matrix of 16,000 gene-gene similarity scores computed by Ramin Homayouni (UTHSC) and Michael Berry (UT Knoxville). This feature is still experimental, and GeneNetwork users should note that pairs of genes that are mentioned together in a small set of papers may have inappropriately high correlations. For more information on the algorithm please contact Ramin Homayouni. (Implementation by Ramin Homayouni and Jintao Wang).
2005-03-01: Network Graph output has been improved significantly. It is now possible to change the labels from probe set IDs to gene symbols. Nodes can also be color-coded by database. Markers and genotypes can be used as nodes. Literature Correlations can be used to define the lines (edges) between traits. (Implementation by Jintao Wang).
2005-03-01: Heritability Weighted Transform method has been published in Genome Biology. This method (HWT1PM) provides significantly higher signal than other common transforms. (Design and implementation by Ken Manly.)
2005-02-23: Database Schema has been published online at http://www.genenetwork.org/schema.html. This schema (January 2005 version) was generated using MySQLdump v 9.1. (Implementation by Jintao Wang, Bill Bug, and Ken Manly.)
2005-02-23: Scriptable Interface improved to handle queries from Genome Browser and other systems. The new interface provides a list of links to data from multiple tissues and strains for a single gene. For example, to retrieve expression estimates for Kcnj8 the URL query has this form: http://www.genenetwork.org/cgi-bin/beta/main.py?cmd=search&gene=kcnj8. This query does not resolve the many possible aliases for gene symbols, and requires the use of the preferred or official gene symbol. (RWW, implementation by Jintao Wang)
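A query URL of the form shown above can be assembled with the Python standard library. This is a small sketch: the base address and the cmd and gene parameters are taken from the example in the text, the helper name gene_search_url is hypothetical, and the 2005 address is not guaranteed to still resolve.

```python
from urllib.parse import urlencode

# Base address as given in the news item above (historical).
BASE_URL = "http://www.genenetwork.org/cgi-bin/beta/main.py"


def gene_search_url(symbol):
    """Build the search URL for one official gene symbol."""
    # urlencode escapes any characters that are not safe in a query string.
    return BASE_URL + "?" + urlencode({"cmd": "search", "gene": symbol})


url = gene_search_url("kcnj8")
```

Because the interface does not resolve aliases, the caller is expected to pass the preferred or official symbol.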
2005-01-27: QTL Reaper 1.0.0 has been released. QTL Reaper is a platform-independent program for rapidly mapping thousands of traits. It is now available to advanced users at SourceForge (241 KB, written in Python and C with sample and help files). QTL Reaper can map well over 50,000 traits in under 12 hours on fast single-processor systems. It includes a sophisticated method (Besag and Clifford, 1991) to adjust the number of permutation tests to estimate genome-wide p values with reasonable precision down to values of approximately 10^-5 (10^6 permutations). This feature is useful for identifying reproducible QTLs in large transcriptome data sets, that is, sets of QTLs with defined false discovery rates. (Design by Ken Manly, implementation by Jintao Wang.)
Besag J, and Clifford P (1991). Sequential Monte Carlo p-values. Biometrika 78: 301-304.
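The idea of adjusting the number of permutations can be illustrated with a sequential Monte Carlo p value in the style of Besag and Clifford. The sketch below is not taken from QTL Reaper: the function name, the default values of h and max_draws, and the draw_null callback (which stands in for "permute the trait and return the peak statistic") are assumptions made for illustration.

```python
def sequential_p_value(observed, draw_null, h=10, max_draws=999):
    """Sequential Monte Carlo p value with early stopping.

    observed:  the statistic from the real data (for example a peak LRS).
    draw_null: zero-argument callable returning one permuted statistic.
    h:         stop as soon as this many null values reach the observed one.
    max_draws: upper limit on the number of permutations.
    """
    exceedances = 0
    for draws in range(1, max_draws + 1):
        if draw_null() >= observed:
            exceedances += 1
            if exceedances == h:
                # Stopped early: clearly non-significant traits cost few draws.
                return h / draws
    # Limit reached: fall back to the usual (g + 1) / n estimate.
    return (exceedances + 1) / (max_draws + 1)
```

Traits with weak linkage stop after a handful of permutations, while only traits with strong linkage use the full budget, which is what makes very small p values affordable across tens of thousands of traits.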
2005-01-26: The Pair-scan output tables now include a new analytic tool that provides a breakdown of strains in each genotype category (for example, the four two-locus genotypes: B/B, B/D, D/B, and D/D) either in the form of scatter plots or in the form of a box plot. This new feature is still being tested and refined and is currently available only on the test site (www.genenetwork.org/search3.html). This feature will be moved to the public site in February. (Implementation by Jintao Wang)
2005-01-22: Marker Genotype Databases have been added that complement trait and transcriptome databases for the following groups: AKXD, AXB/BXA, CXB, BXH, BXD, LXS, B6D2F2, and the rat HXB/BXH. These new databases enable you to use any marker genotype as a "trait" to search for transcripts or classical phenotypes that may be influenced by particular genomic regions. This is now possible using the new Genotype databases and the Compare Correlates tool. To find all markers on Chromosome 1 just type in "Chr 1" or "Chromosome 1" into the Search field. These marker genotype databases are currently available on the test site (www.genenetwork.org/search3.html) but will be moved to the public site by late January. (Implementation by Jing Gu, Lu Lu, Yanhua Qu, Rob Williams, and Jintao Wang)
2005-01-21: New Data Download feature has been added. The Information files for most UTHSC Brain databases (e.g., the RMA Orig transform) now have links to Excel workbooks that include the full Affymetrix U74Av2 data set of 100 arrays for each transform. These Excel workbooks also include a separate spreadsheet with the strain averages for each transform. Look for the word "Download" in the Information pages. (Implementation by Yanhua Qu)
2005-01-13: We have added a new BLAST probe analysis tool to the Probe Information tables associated with each Affymetrix probe set. This button-tool aligns any PM 25-mer probe to the GenBank sequence that Affymetrix lists as being the sequence source. When BLAT analysis of concatenated probes does not provide an unequivocal map location for a probe set, this method can be used to verify that the GenBank accession is correct. If so, it may then be appropriate to BLAT the entire GenBank entry to verify probe set map location. (Implementation by Yanhua Qu)
2005-01-11: Rat HXB/BXH Published Phenotype databases added to the GeneNetwork. The genetic maps that are used in combination with these phenotypes are based on a total of 770 markers. Phenotypes were all provided by Michal Pravenec. We thank Tim Aitman and Pierre Mormede for review of their data sets. (Implementation by RWW, MP, and JW)
2005-01-03: We now provide links to entire data files for the U74Av2 brain data set. All DAT, CEL, TXT, RPT, and EXP files can be downloaded. For example, here are data files for five C57BL/6J U74Av2 arrays. The complete U74Av2 data set consists of a total of 100 arrays, all of which can be reached from the Main Table in any of the Information Pages for these different transforms (MAS5, RMA, PDNN, HWT1PM, dChip). The DAT, CEL, RPT and EXP files will be identical among all transforms. The only differences among transforms are the TXT files. The appropriate reference to cite if you make use of these data files is:
Chesler EJ, Lu L, Shou S, Qu Y, Gu J, Wang J, Hsu HC, Mountz JD, Baldwin N, Langston MA, Threadgill DW, Manly KF, Williams RW (2005) Genetic dissection of gene expression reveals polygenic and pleiotropic networks modulating brain structure and function. Nature Genetics 37: 233-42.
2004-12-25: First draft of the WebQTL Glossary is completed. Many key terms are now defined. We will be adding links to the glossary from graphs and other pages.
2004-12-22: An annotated Links list has been added. Email RW Williams if you have suggestions for additional sites that have proved useful in combination with GeneNetwork resources.
2004-12-21: We have implemented a new method of transforming Affymetrix microarray data called the Heritability Weighted Transform (Manly et al. 2005). When used with large Affymetrix data sets of the type used by WebQTL, this method is considerably more powerful than other common probes-to-probeset transforms such as MAS5, PDNN, RMA, or dChip. To evaluate this new method please try the Mouse/BXD/Brain/Database called UTHSC Brain mRNA U74Av2 (Dec03) HWT1PM (HWT1PM is short for Heritability Weighted Transform Version 1, Perfect Match Probes only). For further details on this method see the Info page. The reference for this approach to transforming Affymetrix array data is:
Manly KF, Wang J, Williams RW (2005) Weighting by heritability for detection of quantitative trait loci with microarray estimates of gene expression. Genome Biology 6: R27.
2004-12-17: We have added mouse UniGene identifiers from Build 142. It is therefore now possible to enter search terms such as "Mm.1" to find data on S100 calcium binding protein A10 (S100a10). A total of 38,034 probe sets on the Affymetrix mouse expression array 430 2.0, have UniGene identifiers.
2004-12-14: First draft of the WebQTL Frequently Asked Questions is completed. We will be happy to answer any other questions you have. Please email RW Williams.
2004-12-13: Major additions are expected later in December in both the SJUT Cerebellum data set and in the INIA Brain data set. Sample size will be almost doubled in both data sets.
2004-12-10: Updated positions of Mouse Expression Agilent G4121A probes using the May 2004 (mm5) assembly of the mouse genome. This work was carried out by Yanhua Qu.
2004-12-10: We have begun to combine WebQTL and The GeneNetwork. WebQTL is the first and so far only "channel" of the GeneNetwork. However, our hope is that there will soon be other projects that will share use of the GeneNetwork. The main URL is now www.genenetwork.org. Requests to www.webqtl.org will resolve to www.genenetwork.org.
2004-12-03: Rat HXB/BXH genotype and published phenotype databases added to beta test site of WebQTL. The genetic maps are based on a total of 770 markers. Phenotypes were all provided by Dr. Michal Pravenec.
2004-12-02: Important new graphic and analytic tools have been added.
The first of these is the Compare Correlates tool. This function is available in Selection Windows. It is essentially a Venn diagram set tool. Instead of providing simple graphs, it provides lists of traits in different parts of a virtual Venn diagram. For example, to find traits that covary with Sonic Hedgehog, Indian Hedgehog, Desert Hedgehog, Patched1, and Gli3, you would select five key transcripts into a Selections window (use the "Add Selection" tool and then select a group of traits in the Selections window). Compare Correlates allows you to choose the target database to which the key traits will be correlated. Compare Correlates was designed by Elissa Chesler and Stephen Pitts. Code was written and optimized by Stephen Pitts.
The second new tool is Network Graph. This function displays a set of traits and their correlations in the form of a graph with nodes (traits) and lines (correlations). There are quite a few tunable parameters, including the correlation threshold used to draw (or not draw) a line between nodes. To use this new tool, you again need to have traits loaded into one of the Selections windows. Network Graph was designed by Elissa Chesler and Stephen Pitts. Code was written, optimized, and error-checked by Stephen Pitts.
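The thresholding step behind such a graph can be sketched in a few lines. This is an illustration only, not the Network Graph code: it assumes Pearson correlation, traits measured on the same strains in the same order, and nonzero variance for every trait; the function names and the default threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def network_edges(traits, threshold=0.5):
    """Return (trait_a, trait_b, r) for every pair with |r| >= threshold.

    traits: dict mapping trait name to a list of values, one per strain.
    """
    edges = []
    for (name_a, values_a), (name_b, values_b) in combinations(traits.items(), 2):
        r = pearson(values_a, values_b)
        if abs(r) >= threshold:
            edges.append((name_a, name_b, r))
    return edges


demo = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],   # perfectly correlated with a
    "c": [4, 3, 2, 1],   # perfectly anti-correlated with a
    "d": [1, 3, 2, 4],   # r = 0.8 with a
}
strong = network_edges(demo, threshold=0.9)
```

Raising the threshold removes weaker lines and leaves a sparser graph; with the value of 0.9 used here, trait d ends up as an unconnected node.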
2004-10-24: Updated positions of all Mouse Expression U74Av2, 430A, 430B, and 430 2.0 probe sets using the May 2004 (mm5) assembly of the mouse genome. This work was carried out by Yanhua Qu. The M430 data consists of 45,000 probe sets. Positions were obtained using a series of methods: Method 1. A BLAT analysis of the actual probe sequence using a 48-processor cluster (our thanks to Yan Cui). Roughly 90% of all probe sets were mapped using this method. If the probe sequence did not BLAT with a score above 99 AND an identity match of 100, then we used Method 2: We used the position of the probe set given in the affMOE430.txt.gz data file. This method recovered position data for approximately 5% of all probe sets. If Method 2 failed, then we used Method 3: We obtained the position given by Affymetrix in the files called "MOE430A Annotations, CSV (6.3 Mb, 10/12/04)" and "MOE430B Annotations, CSV (3.9 Mb, 10/12/04)". This method recovered positions for roughly 4% of probe sets. As a last resort we used Method 4: We retained position data from mm4 or mm3 without interpolation. No position data could be found for 198 records and no chromosome could be found for 46 probe sets. We estimate that 5 to 10% of position data are unreliable.
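The series of methods above is a fallback chain: each source is tried in order and the first one that yields a position wins. The sketch below shows that pattern. It is a hypothetical outline, not the script that was used: the function names, the lookup callables, and the probe set identifiers in the example are invented for illustration, and only the BLAT acceptance rule (score above 99 and identity of 100) comes from the text.

```python
def blat_hit_accepted(score, identity):
    """Acceptance rule quoted in the text for Method 1."""
    return score > 99 and identity == 100


def resolve_position(probe_set, sources):
    """Try each (label, lookup) in order; return the first position found.

    sources: list of (label, lookup) pairs, where lookup(probe_set)
             returns a position or None when that source has no answer.
    """
    for label, lookup in sources:
        position = lookup(probe_set)
        if position is not None:
            return label, position
    return None, None   # no source could place this probe set


# Toy lookup tables standing in for BLAT results and annotation files.
sources = [
    ("Method 1: BLAT", {"ps1": ("Chr1", 10.5)}.get),
    ("Method 2: affMOE430 file", {"ps2": ("Chr2", 20.0)}.get),
]
```

Recording the label alongside the position keeps track of which method produced each value, which helps when judging how reliable a given position is.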
2004-10-16: Expression data set for the striatum of BXD strains released by Glenn Rosen to the www.webqtl.org/search3.html beta site. This is the first WebQTL database that exploits the Mouse Expression 430 2.0 array from Affymetrix. Four versions were released: MAS5, RMA, PDNN, and the new GCRMA.
2004-10-11: New hierarchical Search Page interface released to main site (Choose species, cross, type, and database). New Info pages released. More complete annotation and explanation of the use of the pair-scan data is now provided when the "permutation" option is selected in the Analysis Tools area of the Trait Data and Editing Form.
2004-09-22: Pair-scan feature is now zoomable. Click on any single chromosome pair region to zoom in.
2004-08-20: Pair-scan permutation test is now available. It takes about 90 seconds to run 500 permutations.
2004-07-15: New Pair-scan function has been added to WebQTL. It searches for pairs of chromosomal regions that may be involved in two-locus epistatic interactions.
2004-06-07: Interval mapping graph in 2X resolution is now available for downloading.
2004-06-02: Three new B6D2F2 databases have been added to WebQTL. Dominance estimation for interval mapping with F2 data is available.
2004-05-03: Cluster QTL map display has been added to WebQTL. These QTL heat maps can be drawn using three different color assignments.
2004-03-18: Users can now add their own traits to selections. The correlation matrix, multiple mapping, and some other features can be used with those traits.
2005-07-15: Database List Selector has been implemented for the administrator. This facility is used to select the best databases to use by external resources that link to the GeneNetwork. (Implemented by Jintao Wang.)
Information about this text file:
This text file originally generated by RWW, March 2004.