NYU Value-Added Galaxy Catalog

Note: Due to a security flag on the web server, the files listed here are not available. We hope to make them available again in a few days. 2017-08-08

Corresponding authors:
Michael R. Blanton and David W. Hogg
Center for Cosmology and Particle Physics
Department of Physics
New York University

Additional authors: David Schlegel (LBL), Douglas Finkbeiner (Harvard University), Nikhil Padmanabhan (LBL), Max Tegmark (MIT), Idit Zehavi (Case Western), Andreas Berlind (Vanderbilt), Ryan Scranton (UPitt), Christy Tremonti (University of Arizona), Jeff Munn (USNO), Gillian Knapp (Princeton University), James Gunn (Princeton University)

Note any important updates.

General Information

The NYU Value-Added Galaxy Catalog (NYU-VAGC) is a cross-matched collection of galaxy catalogs maintained for the study of galaxy formation and evolution. It includes carefully constructed large-scale structure samples useful for calculating power spectra, correlation functions, etc.

This catalog is described in a paper published in the Astronomical Journal. Usage of the NYU-VAGC requires citing this paper (as well as the appropriate citations for component catalogs and calibrations).

DR7, the final public sample, consists of 10417 sq deg of imaging coverage and 7966 sq deg of spectroscopic coverage. (The spectroscopic coverage excludes SEGUE-2 data.)

See important notes for updates on the DR6 and DR7 LSS samples.

See the FAQ for answers to some questions about the NYU-VAGC. There is also a discussion of the motivation behind the NYU-VAGC, as well as some of the science being done with it.

Currently, no funding exists for the NYU-VAGC and we create it for the warm, fuzzy feeling inside that it gives us when we think that people might be using it. In order to feed that warm, fuzzy feeling we greatly appreciate any feedback!

Conditions of use

If you download NYU-VAGC data, we would appreciate you:

Downloading the data

Here we describe how to get to the data, and below we describe how the data is organized.

The public release data are available through a set of directories conforming to the data model described below. The base URLs are:

You can retrieve this set of data automatically using Unix WWW downloaders such as GNU wget or curl.
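For those who prefer a script to wget or curl, the following is a minimal Python sketch of the same idea using only the standard library. The base URL shown is a hypothetical placeholder, not a real NYU-VAGC address, and the helper names `file_url` and `download` are invented for this illustration; substitute one of the actual base URLs and a path from the data model.

```python
import os
import urllib.parse
import urllib.request

# Hypothetical placeholder: replace with one of the catalog's real base URLs.
BASE_URL = "http://example.org/vagc-redux/"


def file_url(base_url, relative_path):
    """Join a base URL and a path relative to it into a full file URL."""
    if not base_url.endswith("/"):
        # Without the trailing slash, urljoin would drop the last path segment.
        base_url += "/"
    return urllib.parse.urljoin(base_url, relative_path.lstrip("/"))


def download(base_url, relative_path, dest_dir="."):
    """Fetch one file into dest_dir and return the local path."""
    url = file_url(base_url, relative_path)
    local_path = os.path.join(dest_dir, os.path.basename(relative_path))
    urllib.request.urlretrieve(url, local_path)
    return local_path


if __name__ == "__main__":
    # Only prints the URL; call download(...) once BASE_URL is a real address.
    print(file_url(BASE_URL, "object_sdss_imaging.fits"))
```

For mirroring whole directory trees, wget or curl remain the more convenient choice; a script like this is mainly useful for fetching a handful of named files.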

Organization of the data

All entries in the catalog are sources in the SDSS imaging survey, included either because they pass the Main sample target criteria (in some version) or because they have an SDSS spectrum close to their center (details are in the object_sdss_imaging file description).

We match all of the objects to the SDSS spectroscopic survey, to the FIRST radio survey, to the 2MASS Point Source Catalog, to the 2MASS extended source catalog, to the Two-degree Field Galaxy Redshift Survey, to the IRAS Point Source Catalog Redshift Survey, and to Reference Catalog 3 (RC3.9b).

We present all the data in the form of FITS binary tables, FTCL parameter files, and mangle-style polygon files. idlutils contains tools for reading all of these files into IDL structures.

In the $VAGC_REDUX directory are a set of files whose rows are parallel (the same row in different files always refers to the same object):

Each row refers to the same object in all of the files; that is, the data in row 10 of object_twomass.fits is the 2MASS data available for the object in row 10 of object_sdss_imaging.fits.
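The row-parallel layout can be illustrated with a small Python sketch. The tables below are toy in-memory stand-ins, not real catalog data: the column names (`ra`, `dec`, `k_mag`), the values, and the use of `None` to mark an object with no match are all assumptions made for this example. In practice the tables would be read from the FITS files with a FITS reader.

```python
# Toy stand-ins for two row-parallel tables (values are made up).
object_sdss_imaging = [
    {"ra": 10.0, "dec": 1.0},
    {"ra": 11.5, "dec": -0.5},
    {"ra": 12.2, "dec": 0.3},
]
object_twomass = [
    {"k_mag": 13.1},
    None,  # assumption: this object has no 2MASS match
    {"k_mag": 12.4},
]


def gather_row(tables, row):
    """Collect the entry at the same row index from every parallel table."""
    lengths = {len(table) for table in tables.values()}
    if len(lengths) > 1:
        raise ValueError("tables are not row-parallel: lengths differ")
    return {name: table[row] for name, table in tables.items()}


if __name__ == "__main__":
    tables = {
        "object_sdss_imaging": object_sdss_imaging,
        "object_twomass": object_twomass,
    }
    # Row 2 of every table describes the same object.
    print(gather_row(tables, 2))
```

The point of the layout is that no join key is needed: the row index itself is the key, so any selection made on one file can be applied directly to the others.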

Note that the above files only contain the objects in each survey which match entries in object_sdss_imaging. In order to access the other objects from the survey in question, you can look in files of the form:


See the documentation for the "object_" files for more details.

Some derived quantities are also available for each object in subdirectories. Currently these consist of:

Description of the geometry of the catalog

For the SDSS imaging survey we have an expression for its geometry in terms of spherical polygons.
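As a rough illustration of what a spherical-polygon description allows, the Python sketch below tests whether a sky position lies inside a polygon. It assumes the cap convention commonly associated with mangle: a polygon is the intersection of caps, each cap is given by a unit vector (x, y, z) and a number cm = 1 - cos(theta), where theta is the cap's angular radius, and a negative cm denotes the complement of the cap. The function names are hypothetical, the treatment of points exactly on a boundary is arbitrary, and this is not the mangle or idlutils implementation.

```python
import math


def radec_to_xyz(ra_deg, dec_deg):
    """Convert right ascension and declination (degrees) to a unit vector."""
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    return (
        math.cos(dec) * math.cos(ra),
        math.cos(dec) * math.sin(ra),
        math.sin(dec),
    )


def in_cap(xyz, cap):
    """Test a unit vector against one cap (x, y, z, cm)."""
    cx, cy, cz, cm = cap
    one_minus_cos = 1.0 - (xyz[0] * cx + xyz[1] * cy + xyz[2] * cz)
    if cm >= 0:
        return one_minus_cos < cm  # inside the cap
    return one_minus_cos > -cm  # negative cm: outside the cap


def in_polygon(ra_deg, dec_deg, caps):
    """A point is in the polygon if it satisfies every cap."""
    xyz = radec_to_xyz(ra_deg, dec_deg)
    return all(in_cap(xyz, cap) for cap in caps)


if __name__ == "__main__":
    # Ring around the north pole between 5 and 10 degrees from the pole.
    outer = (0.0, 0.0, 1.0, 1.0 - math.cos(math.radians(10.0)))
    hole = (0.0, 0.0, 1.0, -(1.0 - math.cos(math.radians(5.0))))
    print(in_polygon(0.0, 82.0, [outer, hole]))
```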

However, for most people who are worried about the angular selection function, it is best to use a large-scale structure subsample, where the imaging, target, and tiling masks are combined and the flux limit and completeness are tracked as a function of position. In the $LSS_REDUX/dr72 directory are the files:

The geometry is built out of two separate files, one expressing the window and the other expressing the bright star mask:

In addition there are a number of subdirectories. Each subdirectory corresponds to a "pre-redshift" selection criterion; this means that the objects and the area of sky have been selected according to flux limit, completeness, and other properties which do not require knowing the object redshift. Further subdirectories provide complete subsamples based on cuts made after the redshift determination: on redshift, luminosity, intrinsic color, etc. Further details on the large-scale structure samples are available.
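The two-stage selection described above can be sketched in Python as follows. This is only an illustration of the logic, not the procedure used to build the samples: the field names (`r_mag`, `z`, `abs_mag`), the function names, and every numerical value are hypothetical.

```python
def pre_redshift_select(objects, flux_limit):
    """Keep objects brighter than a flux limit; no redshift is needed here."""
    return [obj for obj in objects if obj["r_mag"] < flux_limit]


def post_redshift_select(objects, z_min, z_max, abs_mag_limit):
    """Cut on quantities that require a redshift: z range and luminosity."""
    return [
        obj
        for obj in objects
        if z_min <= obj["z"] <= z_max and obj["abs_mag"] < abs_mag_limit
    ]


if __name__ == "__main__":
    # Made-up objects, for illustration only.
    objects = [
        {"r_mag": 17.0, "z": 0.05, "abs_mag": -20.5},
        {"r_mag": 17.5, "z": 0.20, "abs_mag": -21.0},
        {"r_mag": 18.5, "z": 0.05, "abs_mag": -19.0},
    ]
    parent = pre_redshift_select(objects, flux_limit=17.6)
    subsample = post_redshift_select(parent, 0.01, 0.10, abs_mag_limit=-20.0)
    print(len(parent), len(subsample))
```

Keeping the two stages separate mirrors the directory layout: the first stage fixes the objects and sky area, and several different second-stage subsamples can then be cut from the same parent.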

Software tools

We use a large suite of tools in order to create this catalog. The basics are all in idlutils. Public tools for dealing with the spectra are in idlspec2d. Public tools for dealing with the imaging data are in photoop. These are all IDL products, and you need IDL to use them. But you are an astronomer, and probably need IDL anyway (sigh).

Deserving of special mention are several tools within the latest version of idlutils:

A special piece of code that has been useful to us (and is compiled into idlutils) is mangle, a set of tools for dealing with angular masks developed by Andrew Hamilton and Max Tegmark. An old version of this code (v1.4) is distributed with idlutils.

Datasweeps of full SDSS catalog

One of the very useful tools that we use to build the NYU-VAGC is the set of "datasweeps" of the full SDSS catalog. These are compressed versions of the full catalog that contain only the decent detections. They occupy relatively little disk space (about 90 GB). Their drawback is that they do not contain all of the photometric information.

All of the data for DR7 is at: http://sdss.physics.nyu.edu/datasweep/dr7/

There are two catalogs, one for stars and one for galaxies. For stars, the datasweeps include any star whose PSF extinction-corrected magnitude is brighter than the following limit in at least one band: u < 22.5, g < 22.5, r < 22.5, i < 22, z < 21.5. The catalog is broken down into runs and camcols, and is in files with the names:


Similarly, for galaxies, the datasweeps include any galaxy whose model extinction-corrected magnitude is brighter than the following limit in at least one band: u < 21.0, g < 22.0, r < 22.0, i < 20.5, z < 20.1. The names of the files for each run and camcol are:
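The magnitude cuts quoted for stars and galaxies amount to an "at least one band" test, which the following Python sketch spells out. The limits are the ones stated in the text; the function name `passes_sweep_cut` and the dictionary layout are hypothetical, and the magnitudes passed in are assumed to be already extinction corrected (PSF magnitudes for stars, model magnitudes for galaxies).

```python
# Limits as quoted in the text.
STAR_LIMITS = {"u": 22.5, "g": 22.5, "r": 22.5, "i": 22.0, "z": 21.5}
GALAXY_LIMITS = {"u": 21.0, "g": 22.0, "r": 22.0, "i": 20.5, "z": 20.1}


def passes_sweep_cut(mags, limits):
    """Return True if the object is brighter than the limit in any band.

    mags maps band name to extinction-corrected magnitude; smaller
    magnitudes are brighter, so the comparison is a strict "<".
    """
    return any(
        mags[band] < limit for band, limit in limits.items() if band in mags
    )


if __name__ == "__main__":
    # Made-up galaxy that is faint in most bands but passes in z.
    galaxy = {"u": 23.0, "g": 23.0, "r": 23.0, "i": 23.0, "z": 20.0}
    print(passes_sweep_cut(galaxy, GALAXY_LIMITS))
```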


The meanings of the columns of the FITS files conform to the naming conventions at the Princeton reductions site. Important things to remember are:

Known problems

Please contact us with comments or questions.
