Organizer: Mike Blanton, with indispensable help from Daniel Eisenstein, Adrian Pope, David Schlegel, Max Tegmark, and Idit Zehavi
The SDSS Large-scale Structure Sample is a compilation of data from the SDSS spectroscopic survey of galaxies and QSOs, designed to be easy to use for the study of large-scale structure. The design philosophy is to organize the data in a sensible file and directory structure. It is in effect a relational database, only expressed as a set of files.
An important general note: we ALWAYS zero-index everything. So the fourth line of a file will be indexed by "3".
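As a small illustration of this convention, here is a minimal sketch in Python (which is also zero-indexed). The function name `line_at` and the list contents are hypothetical stand-ins, not part of the distribution.

```python
def line_at(lines, index):
    """Return the line at a zero-based index, the convention used throughout."""
    return lines[index]

# A stand-in for the contents of a file, one entry per line (invented values).
lines = ["first", "second", "third", "fourth"]

# The fourth line of the file is indexed by 3.
fourth = line_at(lines, 3)
```

If you read the files with a one-indexed tool, subtract one from its line numbers before comparing with any index given here.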
[Figure: Redshift Distribution]
[Figure: Sky Coverage (targeted)]
The data is accessible through the "alfred" account on hola.cosmo.fas.nyu.edu, in the same manner as the 2D spectroscopic reductions at Princeton, as described here.
Most people will want the Large-Scale Structure samples. These are in the directories:
    /global/data/sdss/lss/sample#/lss
where "#" is the sample number.
The large-scale structure samples contain lists of galaxies, their positions, the selection function for each one, and a description of the boundaries of the sample. They also include random samples for calculating N-point statistics. The basic versions to use are:
    /global/data/sdss/lss/sample#/lss/sample#safe1
    /global/data/sdss/lss/sample#/lss/sample#full1
The details of each sample are in README files in each directory. Both samples contain only galaxies within a small absolute magnitude range. The difference between the two is that "safe1" uses a constant flux limit, whereas "full1" uses a position-dependent flux limit. Using a position-dependent flux limit allows one to use significantly more objects. I usually suggest using both and checking that your answers are consistent, to make sure you have implemented the variable flux limit correctly.
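To make the "#" substitution concrete, here is a minimal sketch that builds the two directory names for a given sample number. The helper name `lss_sample_dir` and the constant `LSS_ROOT` are hypothetical; only the path pattern comes from the text above.

```python
LSS_ROOT = "/global/data/sdss/lss"

def lss_sample_dir(sample_number, version):
    """Build the directory for one version ("safe1" or "full1") of a sample.

    Follows the pattern /global/data/sdss/lss/sample#/lss/sample#<version>,
    with "#" replaced by the sample number.
    """
    name = f"sample{sample_number}"
    return f"{LSS_ROOT}/{name}/lss/{name}{version}"

# Example for sample 8: the constant and position-dependent flux limit versions.
safe_dir = lss_sample_dir(8, "safe1")
full_dir = lss_sample_dir(8, "full1")
```

Running the same analysis over both `safe_dir` and `full_dir` is one way to set up the consistency check suggested above.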
Many of the details of the samples can be extracted from the README files given. However, if you want to know more about the objects in the samples (such as petroR90 or something), you don't have to match these lists to tsObj files or go to SX. All that information is contained in the sample8 directory tree, so you should read on to discover how to do this. Basically, each lss file gives an index number which tells you where in the "master" list to look for each object. The information on the objects in the master list is given in files in the fits/ and ascii/ subdirectories (ascii/ just gives the same info as the fits/ in an easier-to-read format).
Be careful if you want to define subsamples of the samples I distribute here. For example, you can't just define color subsamples of "safe1" and use the selection function weights given, because the different subsamples will have different selection functions. Talk to me if you want to do something like that.
Finally, if you have calculated things about the galaxies from the spectra and/or images, you can send me the files. I'll do the ra/dec matching (which takes only a few seconds) and include the results of the matching in the fits/ subdirectory of the sample. That way, your statistic will be available for other people to use!
The description that follows here applies to sample8 and beyond (and more-or-less to sample7, but not in detail).
As explained further below, there are two versions of the photometric data, corresponding to different reductions. There is the data reduction which was used to target the objects (the "Drilled"), and there is the latest and greatest version of the data reduction (the "Best"). For much of the data, these are actually identical. However, I carry around both sets of files separately (labeled "Drilled" or "Best"). For all the science *I* do, I simply use the "Best" reductions. However, it is potentially useful to have the "Drilled" reductions, so I keep them around.
The two fundamental files are:
    ascii/idBest.sample#.dat
and
    ascii/idDrilled.sample#.dat
These contain the run/rerun/camcol/field/id numbers for all tiled targets in each target version (as well as other information, as described in the README.ascii file). To see the number of tiled targets in each version, just look at the number of lines in these files. A tiled target is an object which passes:
    (primTarget & 33279) || (secTarget & 512)
That is, it is one of the target types in primTarget:
    AR_TARGET_QSO_HIZ            = 0x1,
    AR_TARGET_QSO_CAP            = 0x2,
    AR_TARGET_QSO_SKIRT          = 0x4,
    AR_TARGET_QSO_FIRST_CAP      = 0x8,
    AR_TARGET_QSO_FIRST_SKIRT    = 0x10,
    AR_TARGET_GALAXY_RED         = 0x20,
    AR_TARGET_GALAXY             = 0x40,
    AR_TARGET_GALAXY_BIG         = 0x80,
    AR_TARGET_GALAXY_BRIGHT_CORE = 0x100,
    AR_TARGET_STAR_BROWN_DWARF   = 0x8000
or in secTarget:
    TAR_TARGET_HOT_STD = 0x200
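The tiled-target test can be sketched in Python as follows. The flag names and values are copied from the lists above; the function and constant names (`combined_mask`, `is_tiled_target`, `PRIM_TILED_MASK`, `SEC_TILED_MASK`) are hypothetical. The sketch also shows that OR-ing the listed flags reproduces the decimal masks 33279 and 512.

```python
# Flag values copied from the primTarget and secTarget lists above.
PRIM_TILED_FLAGS = {
    "AR_TARGET_QSO_HIZ": 0x1,
    "AR_TARGET_QSO_CAP": 0x2,
    "AR_TARGET_QSO_SKIRT": 0x4,
    "AR_TARGET_QSO_FIRST_CAP": 0x8,
    "AR_TARGET_QSO_FIRST_SKIRT": 0x10,
    "AR_TARGET_GALAXY_RED": 0x20,
    "AR_TARGET_GALAXY": 0x40,
    "AR_TARGET_GALAXY_BIG": 0x80,
    "AR_TARGET_GALAXY_BRIGHT_CORE": 0x100,
    "AR_TARGET_STAR_BROWN_DWARF": 0x8000,
}
SEC_TILED_FLAGS = {"TAR_TARGET_HOT_STD": 0x200}

def combined_mask(flags):
    """OR together all flag values to get a single bit mask."""
    mask = 0
    for value in flags.values():
        mask |= value
    return mask

PRIM_TILED_MASK = combined_mask(PRIM_TILED_FLAGS)
SEC_TILED_MASK = combined_mask(SEC_TILED_FLAGS)

def is_tiled_target(prim_target, sec_target):
    """True if the object passes (primTarget & 33279) || (secTarget & 512)."""
    return bool((prim_target & PRIM_TILED_MASK) or (sec_target & SEC_TILED_MASK))
```

For example, `is_tiled_target(0x40, 0)` is true for a main-sample galaxy target, while a primTarget value with only bit 0x200 set does not pass, since that bit is not in the primTarget mask.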
Note that the "Drilled" and "Best" sets of data are incommensurate in the sense that line 10 of a "Best" file refers to a different object than line 10 of a "Drilled" file. However, within each set of "Drilled" or "Best" data, the data IS commensurate. That is, the data on line 10 of idBest.sample8.dat (for example) refers to the same object as that on line 10 of petroBest.sample8.dat, and to the same object on entry 10 of tsObjBest.sample8.fits. Some files will have extra entries on the end of the file, which correspond to non-matches to objects in idBest or idDrilled.
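A minimal sketch of how one might walk two commensurate files in step is given below. It assumes one record per line; the file contents are invented stand-ins, and `paired_records` is a hypothetical helper. Because `zip` stops at the shorter input, trailing extra entries (the non-matches mentioned above) are simply not paired.

```python
import io

def paired_records(id_file, other_file):
    """Yield (index, id_line, other_line) for lines sharing a zero-based index."""
    for index, (id_line, other_line) in enumerate(zip(id_file, other_file)):
        yield index, id_line.rstrip("\n"), other_line.rstrip("\n")

# Stand-ins for idBest.sample8.dat and petroBest.sample8.dat (invented contents).
id_best = io.StringIO("obj-a\nobj-b\nobj-c\n")
petro_best = io.StringIO("petro-a\npetro-b\npetro-c\nunmatched-extra\n")

# Each tuple refers to one object; the unmatched trailing entry is dropped.
pairs = list(paired_records(id_best, petro_best))
```

Never pair a "Best" file with a "Drilled" file this way, since the two sets do not share line numbering.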
Of course, for the subsamples of tiled targets which will make up our science samples, it would be quite wasteful to follow this convention (many targets don't have spectra or will fall out of the sample for other reasons). Instead, in many cases we reference the master list using an index. That is, for each object in the subsample we give its position in the master list (zero-indexed!).
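The index lookup can be sketched like this. The master list and the index values are invented for illustration, and `lookup_in_master` is a hypothetical helper; the only thing taken from the text is that each subsample entry stores a zero-based position in the master list.

```python
def lookup_in_master(master, indices):
    """Return the master-list entries at the given zero-based positions."""
    return [master[i] for i in indices]

# Stand-in master list: one entry per tiled target (invented values).
master = ["m0", "m1", "m2", "m3", "m4"]

# A subsample stores only positions in the master list, not the data itself.
subsample_indices = [0, 3, 4]
subsample = lookup_in_master(master, subsample_indices)
```

The same indices work against any file in the same "Best" or "Drilled" set, because those files share line numbering with the master list.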
Here I give a cursory description of the data, with pointers to more detailed information. Here are the subdirectories of the sample# directories which contain the basic data:
The large-scale structure sample is a carefully agglomerated set of data from the SDSS spectroscopic data. As such, it will always lag a bit behind the data itself, never quite using the current SDSS survey results. If you want the most current list of redshifts, with the associated tiling and photometric information, I store these on hola as well.
Currently the redshifts are the Princeton redshifts, and are stored in "/global/data/sdss/specBS" on hola.cosmo.fas.nyu.edu. They are downloaded anew every night from sdssdata.princeton.edu. My feeling is that you should get them directly from there, or from the Fermilab group, who promise to make this file soon.
The photometric information for all the spectroscopic targets is stored in "/global/data/sdss/tsObjTargets". The README file contains the appropriate information for using this data.
Finally, the tiling information for all tiled targets is in "/global/data/sdss/tiling". Again, the README file has the information necessary to use the data.