Reducing Data from Fibre-fed Instruments

General outline

The data you will get from the observatory will consist of science/object frames and calibration frames. Calibration frames should at minimum include:

  • bias frames
  • flat-fields (lamp/dome and/or twilight)
  • arc frames (for wavelength calibration)

but may additionally include:

  • dark frames
  • observations of a flux standard source, a telluric standard (for correcting telluric absorption features) and/or a radial velocity standard source (for "zero-pointing" of radial velocity measurements)
  • blank-sky observations (for removing the background "sky" flux from your science observations) — this depends on the configuration of the individual IFU
  • twilight flats (for correcting fibre-to-fibre throughput differences: this can be done with dome flats, but twilight flats are better)

Some IFUs have dedicated arrays (or single lenslets/fibres) that simultaneously observe an offset sky region. For those that do not, and if removing the background emission is important to achieving your science goals, you should probably have included separate blank sky observations in your science target list. Emission line contamination of sky spectra can often be a problem, and is addressed below.

For observatories which provide queue scheduling, the observations of the flux calibration sources can sometimes be a day or so separated from your observations. Bear this in mind if you think it is important to have the flux standard observed through the same atmosphere as your target.

Most IFSs have a customised pipeline for reducing the data, and the advice offered here is not meant to replace these. Pipelines are tools; they should not be treated as black boxes, and you should learn how to use them properly. Read the manual, read the execution scripts, read a few papers that have worked with the same data as yours, and don't be afraid to experiment…

The basic steps involved in (fibre-fed) IFS data reduction are:

  1. Subtract the bias (a standard task not detailed further here)
  2. Remove the cosmic ray hits
  3. Identify where the spectra (fibres/apertures) lie on the CCD. This is usually done using a flat-field image (high signal-to-noise ratio) by taking a slice perpendicular to the dispersion direction, identifying where the peaks are (corresponding to each aperture/spectrum), then tracing each aperture along the dispersion direction (usually assuming a Gaussian shape). The section on aperture tracing below describes this process in a little more detail.
  4. Use these traces as templates to identify where the spectra lie on the other frames (usually allowing for a global shift in aperture positions), and extract the flat-field, arc and science spectra. The extracted spectra from a single raw image are normally held in a "row stacked spectra" (RSS) format, i.e. stacked on a single output image. The spectra originally spread over multiple pixels in the cross-dispersion direction; after extraction they are only one pixel wide.
  5. Determine a wavelength solution from the arc spectra (usually solved separately for each aperture) and apply that calibration to the science (+sky +twilight/flat-field) frames. It is usually a good idea to resample all spectra onto a common wavelength range and dispersion; some spectra will then have zero-flux ends, since the spectra on the CCD do not all cover exactly the same range.
  6. If you have twilight frames, extract their spectra as well to make the "throughput" frame; failing twilight frames, use the dome flats. Throughput frames correct for fibre-to-fibre throughput differences (certainly spatial, sometimes also spectral). Divide the science (+sky) frames by these throughput frames. Sometimes it is necessary to re-calibrate this throughput correction using sky lines on the science frames — see below.
  7. Subtract the (usually summed/averaged) sky spectrum, being careful about the effect of any changes in the PSF or wavelength with position on the CCD (which can lead to badly subtracted sky lines).
  8. Determine a flux calibration and apply it to the science spectra.
  9. Perform any further calibrations/corrections as required (e.g. correct for DAR) — see Further Reductions.
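To make steps 1 and 4 concrete, here is a minimal NumPy sketch (not from any real pipeline; the function names, the fixed extraction window and the synthetic inputs are illustrative assumptions):

```python
import numpy as np

def subtract_bias(frame, bias_frames):
    """Step 1: subtract a master bias, here the median of the
    individual bias frames."""
    return frame - np.median(np.stack(bias_frames), axis=0)

def extract_rss(frame, trace_centres, half_width=2):
    """Step 4, heavily simplified: sum a fixed window of rows around
    each traced fibre centre, one output row per fibre, giving a
    row-stacked-spectra (RSS) array.  A real pipeline would follow
    the full trace along the dispersion direction and typically use
    optimal extraction with cross-talk handling."""
    rss = np.empty((len(trace_centres), frame.shape[1]))
    for i, yc in enumerate(trace_centres):
        lo, hi = int(yc) - half_width, int(yc) + half_width + 1
        rss[i] = frame[lo:hi, :].sum(axis=0)
    return rss
```

Each extracted spectrum then occupies one row of the RSS image, matching the one-pixel-wide description in step 4.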

Cosmic ray removal

Removing cosmic rays (CRs) from your data can prove a tricky job (and sometimes a tad boring), but it is essential. It often requires intensive checking, especially if you are working with emission-line spectra: emission lines on raw IFS data can look very much like CRs (small, round, sharp), so you have to find a rejection that is vigorous enough to remove most of the CRs, but not so aggressive that it also removes real features.

The best software routines for dealing with CRs on single images that we have found (out of a good half-dozen out there) are the following. L.A.Cosmic runs within IRAF or IDL and is extremely thorough, but very slow. The MIDAS routine also works extremely well. The DRC routine is very fast, but you need to check the outputs to verify that sky lines have not been removed as well.

If you have multiple (>3) exposures of the same object, then an image-comparison technique can be a good method for removing CRs: compute the median of all images and then compare, pixel by pixel, each image to the median. Where differences above a specified sigma occur, the pixel values are replaced by the median value. With this technique, however, one has to watch out for (i) changes in the observing conditions (e.g. clouds) under which the images in the stack were taken: these change the count rate, so the median may not be valid for an individual image; and (ii) shifts in the position of the telescope between exposures, either real or due to DAR (see here): these change the layout of the spectra on the CCD (because the spaxels now look at different parts of the sky) and so undermine the pixel-by-pixel comparison.
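A minimal sketch of this median-comparison rejection with NumPy (the function name and the sigma clipping via the median absolute deviation are assumptions; real data would first need the exposures registered and scaled, for the reasons just given):

```python
import numpy as np

def reject_cosmics(images, nsigma=5.0):
    """Replace pixels that deviate from the per-pixel median of a
    stack of aligned, equal-scaling exposures by that median value."""
    stack = np.stack(images).astype(float)
    med = np.median(stack, axis=0)
    # robust per-pixel scatter from the median absolute deviation
    sigma = 1.4826 * np.median(np.abs(stack - med), axis=0) + 1e-10
    cleaned = np.where(np.abs(stack - med) > nsigma * sigma, med, stack)
    return list(cleaned)
```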


Aperture tracing

Spectra are usually arranged on the CCD in a manner that most efficiently makes use of the detector size/area. Before these spectra can be extracted (summed and converted into 1D arrays), their exact positions on the CCD must be accurately mapped. This is the process of aperture tracing.

The positions of the apertures (spectra) are identified by extracting a slice perpendicular to the dispersion direction, then each aperture is traced along the dispersion direction by fitting a Gaussian function to the light peak as the algorithm moves along the detector. This is done from a flat-field image (high S/N and uniformly illuminated), and the aperture mapping is then used as a reference for all other images (arc frames, science frames, etc.).
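The peak-finding and tracing logic can be sketched as follows (a NumPy-only toy, assuming well-separated fibres with clean Gaussian profiles; here the per-column Gaussian "fit" is done analytically through the peak pixel and its two neighbours, rather than by full least squares as a real pipeline would):

```python
import numpy as np

def find_peaks(slice_, threshold=0.3):
    """Peak pixels in one cross-dispersion slice of the flat-field."""
    t = threshold * slice_.max()
    return [i for i in range(1, len(slice_) - 1)
            if slice_[i] > t and slice_[i] >= slice_[i - 1]
            and slice_[i] > slice_[i + 1]]

def refine_centre(profile, ipk):
    """Sub-pixel centre from a Gaussian through the peak pixel and
    its two neighbours: log(flux) is then a parabola whose vertex is
    the fibre centre (exact for a noiseless Gaussian profile)."""
    a, b, c = np.log(profile[ipk - 1:ipk + 2])
    return ipk + 0.5 * (a - c) / (a - 2.0 * b + c)

def trace_aperture(flat, ipk0):
    """Follow one fibre along the dispersion axis, re-centring at
    every column using the previous centre as the starting guess."""
    centres = np.empty(flat.shape[1])
    ipk = ipk0
    for x in range(flat.shape[1]):
        profile = flat[:, x]
        # re-centre on the local maximum near the previous position
        ipk = ipk - 1 + int(np.argmax(profile[ipk - 1:ipk + 2]))
        centres[x] = refine_centre(profile, ipk)
    return centres
```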

Individual spectra on the raw CCD (dispersion direction left-right). Each spectrum needs to be traced along the dispersion direction before it can be extracted.
An example screenshot from the aperture tracing algorithm in the girBLDRS VLT FLAMES/ARGUS data reduction process. It shows Gaussian fits to peaks in the non-dispersion direction at different parts of the CCD.

Aperture tracing is not a particularly difficult task, although more complex instruments, such as SAURON or VIMOS in its low-resolution "multiplexed" setting, require some additional effort.

Additional info

The spacing of the spectra (perpendicular to the dispersion direction) is dictated by technical issues such as sampling (how many spaxels you want versus how many spectra can fit on the detector), aberrations, CCD "bleeding" (e.g. from bright emission lines or saturated stars) and the allowable cross-talk between adjacent spectra. Cross-talk can be tolerated if neighbouring elements of the slit are also adjacent on the sky, since then the way the light from adjacent slit elements overlaps spatially in the spectrograph simply reflects the intensity variations between adjacent parts of the sky. Unfortunately this is rarely the case in reality (the spaxels and apertures may be matched along one axis, but rarely along both). In general, the spectra should be packed together as closely as possible on the CCD so that the distribution of light on the detector is similar to that which would have been obtained from a longslit directly sampling the sky.

In some cases, adjacent spectra must be separated by a clean gap. This is particularly true if you expect spectra adjacent on the detector to have bright spectral features, as you will want to avoid "bleeding" of light from one spectrum into the next. It is also important if your (fainter) sky spaxels, for which you want flux coming only from the background field (and which will thus be well separated on the sky from your source field), are placed on the detector next to the (brighter) source spaxels. Another reason for having well-separated spectra is that clear areas of the detector are needed from which to estimate the background signal caused by scattered light within the spectrograph.

The placing of the spectra on the CCD is determined by the instrument design, so there will be nothing that you, as a user, can do to change this.


Wavelength calibration

A quick note on the IRAF sequence of wavelength calibration tasks, which are:

  • identify: wavelength calibrate a single spectrum in your RSS image (usually the central one), setting in the process the function and order you wish to use for the wavelength solution calculated later for the rest of the spectra in the RSS
  • setjd: set various header parameters that subsequent tasks need, in particular the time parameters necessary for dispcor to work
  • reidentify: apply the wavelength solution worked out in identify to the rest of the spectra in the RSS, first working out the shift relative to the spectrum you ran identify on, applying this shift, and then recalculating the fit based on the fit parameters you set in identify
  • refspectra: record which arc frames "belong" to which astronomical frames by adding the header parameters REFSPEC[1,2]
  • dispcor: transfer the wavelength solution from the arc frames to the astronomical frames

These are stubborn tasks, in particular reidentify, and it is highly recommended that you run it in interactive mode. The reason is that if, as it works through the spectra, reidentify cannot find a (good) fit for a spectrum, it does not say so explicitly but rather reports a bad RMS and/or tells you that (only) "1/11 lines" were used in the fit. A bad RMS is not just a high RMS: an extremely low RMS is also a classic indication that too few spectral lines were fit (a 2nd-order fit, for example, can fit 3 data points perfectly). When reidentify finds a bad fit it has in fact found no fit, as the fit details for that spectrum are not recorded in the reidentify output; when you then run dispcor it will die, saying that "not all the apertures have been identified". Therefore, as you go through reidentify, make sure that when the interactive-mode output indicates a bad fit, you ask to look at that fit and adjust it to force a solution through (e.g. by adding more lines to the wavelength solution, or more commonly by removing a bad line).
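The "extremely low RMS" pitfall is easy to demonstrate outside IRAF (the pixel positions and wavelengths below are made up):

```python
import numpy as np

# Pixel positions of three identified arc lines vs. their laboratory
# wavelengths (illustrative values, not a real lamp).
pixels = np.array([210.0, 880.0, 1545.0])
waves = np.array([5015.7, 5460.7, 5875.6])

# A 2nd-order polynomial through only 3 lines is an exact
# interpolation: the RMS is essentially zero regardless of whether
# the line identifications are even correct.  This is why a
# suspiciously low RMS with few fitted lines is a red flag.
coeffs = np.polyfit(pixels, waves, 2)
rms = np.sqrt(np.mean((np.polyval(coeffs, pixels) - waves) ** 2))
```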


Throughput correction

The throughput correction is done to correct for variations in the responsivity of the lenslets/fibres that feed the light from the IFU to the detector. As with traditional imaging, a twilight exposure is a good choice for your throughput image. Note, however, that the traditional flat-fielding which corrects for gross CCD variations and dust features is not done.

For the throughput frame (consisting, in RSS format, of the extracted and wavelength-calibrated throughput spectra), the mean spectrum is calculated and then divided into each individual spectrum (taking care to account properly for any zero-flux ends of spectra). Sometimes it is also necessary to fit these ratio spectra, if they are noisy or contain features that you think are intrinsic only to the throughput frame and not to any other frames (i.e. not also the science frames). Fit with a polynomial of an order that reflects the true wavelength-dependent response variation of your instrument (the instrument manual or observatory scientists should be able to help you with this). Then normalise this new throughput frame to a mean of 1.
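A sketch of this procedure in NumPy (the function name is hypothetical, and zero-flux spectrum ends, which real data would need masked, are ignored):

```python
import numpy as np

def make_throughput_frame(rss, fit_order=None):
    """Build a fibre throughput frame from an RSS twilight/dome flat:
    divide each spectrum by the mean spectrum, optionally smooth each
    ratio spectrum with a low-order polynomial, and normalise the
    resulting frame to a mean of 1."""
    mean_spec = rss.mean(axis=0)
    ratios = rss / mean_spec          # one ratio spectrum per fibre
    if fit_order is not None:
        x = np.arange(rss.shape[1])
        ratios = np.array([np.polyval(np.polyfit(x, r, fit_order), x)
                           for r in ratios])
    return ratios / ratios.mean()     # mean-normalise to 1
```

The science frames are then corrected with something like `science_rss / make_throughput_frame(twilight_rss)`.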

This throughput frame is then divided into the science frames. If you have sky emission lines in your science frames (e.g. the bright [OI] feature at 5577 Å) it is worth fitting these (with a simple Gaussian) and then plotting line flux vs. aperture number. If there are still wiggles in this plot (and in particular, if the same shape of wiggles is found for other science frames taken on the same night) then the throughput correction needs an additional correction. At the same time you can check whether the level of sky-line flux variation over the apertures is in accord with Poissonian errors (i.e. is the standard deviation about the mean equivalent to sqrt(counts)?). The additional correction frame is built as follows: make a spectrum that has aperture number along one axis and the 5577-line flux as its values; mean-normalise this spectrum; then expand it into a 2D frame with axes of aperture number and wavelength, the value at each wavelength being the mean-normalised 5577-line flux. The previously throughput-corrected science frames are then simply divided by this additional correction.
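Assuming the per-aperture sky-line fluxes have already been measured, the additional correction reduces to a few lines (names are illustrative):

```python
import numpy as np

def skyline_throughput_correction(science_rss, line_flux):
    """Extra throughput correction from a sky line (e.g. [OI] 5577):
    line_flux[i] is the fitted flux of the line in aperture i.
    Residual wiggles in line_flux vs. aperture indicate throughput
    errors; dividing them out flattens the sky line across fibres."""
    corr = line_flux / line_flux.mean()   # mean-normalised, per fibre
    # broadcast to a 2D frame: the same correction at every wavelength
    return science_rss / corr[:, None]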


Sky/background subtraction

When the object of interest completely fills the FoV a separate field dedicated to sampling the sky background may be required. This can sometimes be observed simultaneously with the science data, or may need to be a separate observation (depending on the instrument capabilities). The sky field should sample blank sky as close as possible to the object field.

Ideally, the object and background fields should share the same optical path. One way to do this is to duplicate the entire optical train, linking them only at the output slit. In a fibre-fed system, there might be two separate arrays whose fibres are mixed together at the spectrograph slit. The background and object fibres would be grouped together in blocks to avoid cross-talk between unrelated regions of the sky, but be alternated in position within the slit so that any errors related to the position within the spectrograph field cancel out.

There are subtle problems with sky-subtraction using fibres arising from possible variations in the details of the PSF between fibres (see our page here) due to high-order FRD-type effects. These may affect the detailed line profiles so that the subtraction of a bright sky line in one spectrum from that in another may give a significant non-zero residual even when variations in the throughput (as a function of wavelength) of each element are calibrated out. One way to eliminate this effect is to beam-switch between the object and background fields (see here), but this observing technique is not always an option.

In doing the sky subtraction, you might want to simply combine all the sky spectra available in order to get a median sky and then subtract it from the spectra; or you might want to shift or scale the sky spectra in order to correct for spectrograph shifts or background level differences. If a direct sky-spectrum subtraction is not possible, you could instead simply use the sky spectrum to identify what sky lines there are, and then fit these together with your astronomical lines when making your science measurements.
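A minimal version of the first option, a median sky with an optional scale factor (the function name is an assumption):

```python
import numpy as np

def subtract_sky(science_rss, sky_rss, scale=1.0):
    """Subtract the median of the dedicated sky spectra from every
    science spectrum; 'scale' allows for background-level differences
    between the sky and object observations."""
    sky = np.median(sky_rss, axis=0)
    return science_rss - scale * sky
```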

In some cases, where the object is relatively small (e.g. a distant galaxy a few arcsec in extent), it will also be possible to derive the background signal from blank regions within the object field itself (in a similar way to what you might do with a long-slit).


Contaminated sky spectra

  • Rejection of affected sky fibres (best for DensePak-style IFUs with single, spaced out dedicated sky fibres)
  • Manual removal of offending emission lines (IRAF splot 'j' command)
  • If problem is bad, you might just have to do without — and learn for next time!


Flux calibration

Firstly, yes, IFS data can be flux calibrated!

After a throughput correction has been applied, the data can be flux calibrated. The throughput correction corrects for the differential response of the lenslets+fibres, while the flux calibration calibrates the absolute response of the whole instrument.

Your flux standard observations should be reduced in exactly the same way as your science exposures. The spectra pertaining to the standard source should then be extracted and summed. If your standard star observation is of sufficiently high S/N (usually the case) then you can safely sum all the spaxels in the FoV (after subtracting the sky if significant). If this isn't the case, then you will need to perform a more careful point-source extraction.

Additional complications arise when the spaxels are not spatially contiguous on the sky and flux is lost between the gaps. This is the case for a number of older IFUs — more info can be found about this on their specific data reduction pages (e.g. Integral).

The flux calibration procedure for optical IFUs is then essentially the same as what you would apply to a long-slit spectrum:

  1. Compare the observed summed standard star spectrum to a calibrated one to calculate the conversion from counts/s to flux units.
  2. Apply that conversion to *each* spectrum in your science frame to calibrate your science data.
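Assuming the observed standard, its tabulated reference spectrum and the science RSS are all on the same wavelength grid, the two steps collapse to the following sketch (names are illustrative; atmospheric extinction correction is omitted):

```python
import numpy as np

def flux_calibrate(science_rss, std_counts, std_reference,
                   exptime_sci, exptime_std):
    """Step 1: the sensitivity is the ratio of the tabulated standard
    flux to its observed count rate.  Step 2: apply that conversion
    to the count rate of every spectrum in the science RSS."""
    sensitivity = std_reference / (std_counts / exptime_std)
    return (science_rss / exptime_sci) * sensitivity
```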

In the near-IR, the procedures are slightly different, but they also follow the steps usually applied to long-slit observations: first correct for telluric absorption lines, then use a spectrophotometric standard observation to do the flux calibration, using a scaled model appropriate to the star's spectral type.


General data-reduction software for fiber-fed IFSs

Various tools are used to reduce IFS data from fiber-fed instruments. General tools, which work with several instruments, include IRAF and R3D. Another tool is p3d, which was first written for the PMAS lens array (Calar Alto) and MPFS (BTA/6m) and later modified to handle several additional instruments. The original version of p3d was inflexible and difficult to use, although it included several unique tools and routines that nevertheless made it useful at times.

The new version of p3d is a complete rewrite of the original version. It has been designed to be very general. The current version supports the PMAS, VIRUS-P, and SPIRAL IFUs, although p3d can, without too large an effort, be adapted for use with any of the remaining fiber-fed IFSs. Such work has already been initiated (for, e.g., VIMOS [only HR/MR], FLAMES/ARGUS, GMOS, IMACS, INTEGRAL, and MPFS), but exactly when these instruments will be fully supported depends on the interest of the community.

Currently p3d contains routines to handle the following five tasks: creation of a master bias, fully automatic (aperture) tracing, creation of a dispersion mask for wavelength calibration, preparation of data for flat-fielding (throughput correction), and spectrum extraction. Full error propagation is performed in all steps, and all operations are logged. Additional tasks may be added to p3d in the future; examples of such improvements include combination of dithered images, correction for differential atmospheric refraction, removal of cosmic rays in individual images, and flux calibration. The spectrum extraction can, for all supported instruments, be done optimally, correcting for cross-talk (and this is also the recommended method for most instruments). There is not yet any manual for p3d (there has been no time yet to write one), although all routines are thoroughly described, and all actions in the p3d GUI(s) are explained on a separate status line in the respective GUI. The plan is to prepare tutorials on a dedicated wiki that will simplify the use of p3d further.

Everything about p3d can be learned from the project web site, including download and installation instructions. p3d is released under GPLv3. The functionality of p3d is described thoroughly in the recently accepted A&A paper (Sandin et al. 2010); a link to where the paper can be downloaded can also be found at the project web site.

If you would like to help us and improve (some part of) p3d then please contact us (contact details can be found at the project web site). Examples of such improvements could be additional instrument support, parameter corrections, additional routines/tools, manual pages, or even tutorial sections.

/The p3d team

28.1.2011: p3d version 2.0 alpha has now been released. This new release includes support for a large number of additional instruments. The alpha release will, eventually, be replaced with first a beta release, and then the final version 2.0. Please see the p3d web site for more information. /CS


Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License