The Science Requirements Analysis Document (AD01), specifically sections 3.4 and 4, lays out the conditions for user access to the WSA. Here we mainly concentrate on how the V1.0 specified functionality for accessing survey data will be implemented.
Users of the WSA will mainly be interested in access to the stored pixel data (FITS images) and/or the generated object catalogues. The way the data are held in the archive and the different software required to process user requests lead us to consider these two types of access in separate sections.
Section 4 discusses the user interface to the pixel data, splitting it up into the different ways in which users will be able to extract image data and the underlying software required. The section also gives details of relevant experience in serving pixel data.
The user interface providing access to the object catalogue data is similarly broken down and the subject of Section 5. Links to prototypes demonstrating some of the planned functionality are provided.
Conformance of the archive interface to emerging Virtual Observatory standards, and housekeeping issues, are briefly discussed in Sections 6 and 7 respectively.
Section 8 lists the documentation that will be provided to assist the user.
Initial access to information on the WSA and to the data will be via the website maintained by the WFAU.
The WSA website will consist of
Different ways for accessing the pixel data are described below with a schematic representation given in Fig 1. The functionality offered by them will meet the specified requirements of V1.0, with Sections 4.3 and 4.4 covering some aspects of V2.0.
The primary access route for users will be via web forms on the WSA website. Password-protected versions of the forms will restrict access to data not yet classed as world readable; that is, the SQL queries discussed below will be constructed so that they only return and/or pass on image data appropriate to the user's access level. The CGI scripts actioned by the forms will be written in Perl, as it offers a good way of wrapping up the steps required to perform the extractions. Speed and efficiency are more important in the low-level manipulation of the binary data, which will be carried out in C/C++.
Boxes and menus on the form will allow the user to specify the celestial coordinates of the extraction (decimal degrees or sexagesimal), the coordinate system (J2000, B1950, Galactic, SDSS), the size of the area to extract (x,y in arcmin), the waveband, the type of image to use (unstacked, DXS, UDS, difference images), and whether to return a GIF/JPG image in addition to a link to the extracted FITS file. There will also be an option to attach a default object catalogue as a binary table extension to the FITS file.
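Because the form accepts both decimal degrees and sexagesimal input, the receiving script has to normalise the two notations before building a query. The following is a minimal sketch of that step, written in Python for brevity rather than the Perl planned for the CGI scripts. The function names are illustrative, and treating sexagesimal RA as hours (multiplied by 15) is an assumption based on common practice, not something this document states.

```python
def sexagesimal_to_degrees(text, is_ra=False):
    """Convert 'dd:mm:ss.s' (or 'hh:mm:ss.s' when is_ra is True) to decimal degrees."""
    text = text.strip()
    # Take the sign from the string so that '-00:30:00' stays negative.
    sign = -1.0 if text.startswith("-") else 1.0
    parts = text.lstrip("+-").split(":")
    if len(parts) != 3:
        raise ValueError(f"expected three colon-separated fields, got {text!r}")
    major, minutes, seconds = (float(p) for p in parts)
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"minutes/seconds out of range in {text!r}")
    value = major + minutes / 60.0 + seconds / 3600.0
    if is_ra:
        value *= 15.0  # assumption: sexagesimal RA is given in hours
    return sign * value


def parse_coordinate(text, is_ra=False):
    """Accept either decimal degrees or colon-separated sexagesimal input."""
    if ":" in text:
        return sexagesimal_to_degrees(text, is_ra)
    return float(text)
```

Conversion between coordinate systems (B1950, Galactic and so on) is a separate step and is not attempted here.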
When submitted this form will action a Perl CGI script on the server. A summary of the tasks the script will perform is given below and the associated workflow is shown in Fig 2.
It is anticipated that small image extractions will be completed in real time (under 30 seconds). They will be run at higher priority than the background processes described later.
Uploading a file of coordinates to this multi-part form will provide the user with a batch mode front end to the functionality offered in Section 4.1. Users will be asked to supply the path of the upload file, the coordinate system used, size of extraction, waveband, whether they want catalogue data attached and a valid email address. Users will also be able to request PostScript gray-scale plots of the extracted images.
A limit will need to be placed on the total amount of pixel data that can be extracted. Batch access to the SuperCOSMOS Sky Surveys (SSS) is restricted to a total of 1000 sq. arcmin.
The Perl CGI script actioned by this form will carry out similar steps to those described above in Section 4.1 as it loops through the input file. Output will be written to a logfile, and if a given image cannot be extracted the script will skip to the next one. After initial checking of the input parameters, the actual extractions will be run in the background, with a message being returned to the browser informing the user that an email will be sent to them when the script has finished.
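The batch behaviour described here (check the total area, loop over the input, log each outcome, skip failures) can be sketched as follows. This is an illustration in Python, not the planned Perl script. The `extract_one` callable stands in for the real extraction step, and the 1000 sq. arcmin limit is the SSS figure quoted above, used only as a placeholder since no WSA value is given.

```python
import logging

# Placeholder: the SSS batch limit quoted in this section; the WSA value is not specified.
MAX_TOTAL_AREA_SQ_ARCMIN = 1000.0


def total_area_ok(n_positions, x_arcmin, y_arcmin, limit=MAX_TOTAL_AREA_SQ_ARCMIN):
    """Return True if the whole batch stays within the area limit."""
    return n_positions * x_arcmin * y_arcmin <= limit


def run_batch(positions, extract_one, log):
    """Attempt every position; log each outcome and skip any that fail.

    positions   -- iterable of (ra, dec) pairs read from the upload file
    extract_one -- hypothetical callable performing a single extraction
    log         -- a logging.Logger writing to the batch logfile
    """
    extracted, skipped = [], []
    for line_no, (ra, dec) in enumerate(positions, start=1):
        try:
            extracted.append(extract_one(ra, dec))
            log.info("line %d: extracted ra=%s dec=%s", line_no, ra, dec)
        except Exception as exc:
            # One bad position must not abort the rest of the batch.
            skipped.append(line_no)
            log.warning("line %d: skipped (%s)", line_no, exc)
    return extracted, skipped
```

Returning the skipped line numbers makes it straightforward to report them in the completion email.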
If requested, the PostScript finders will be generated before the individual FITS files are tarred up and gzipped. On completion an email will be sent to the user giving links to their results.
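The packaging step amounts to writing the extracted files into a single gzipped tar archive. A minimal sketch using Python's standard `tarfile` module is given below; the function name and the choice to store files by base name only are assumptions made for illustration.

```python
import tarfile
from pathlib import Path


def bundle_results(files, archive_path):
    """Write the given files into a gzipped tar archive and return its path."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for f in files:
            # Store only the base name so server directory layout is not exposed.
            tar.add(f, arcname=Path(f).name)
    return archive_path
```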
This form will enable users to construct an image from multiple CCD frames covering an area of sky up to 0.8 degree across. However the underlying mosaicing software will be able to generate arbitrarily large areas (a V2.0 requirement).
The input parameters will be the same as for PF1 (Section 4.1), with the addition of a text box to set the pixel size (arcsec) of the returned image and a box to enter the user's email address.
This form will again action a CGI script. The main difference in the processing steps outlined in Section 4.1 is that the initial SQL query constructed and sent to the image database table will return the path/filenames of all images held that overlap with the area of sky requested. This list of files together with the size and resolution requested for the output image will in turn be passed to a Perl mosaicing script (see below) that will mosaic the files together into a single image. As a large amount of pixel re-sampling can be involved the main work will be done in the background with the user being notified by email on completion.
Mosaicing Script - each image passed to the script will be re-mapped to another FITS image whose WCS is defined by the pixel size and area of sky requested. Once all images have been processed they will then be merged down into a single FITS image. See Section 5 of the WSA Data Flow Document (VDF-WFA-WSA-005) for more detail.
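The requested area and pixel size together fix the dimensions of the output grid, which gives a quick feel for why the re-sampling is run in the background. The arithmetic can be sketched as follows; the function names and the default of 4 bytes per pixel are assumptions for illustration only.

```python
import math


def mosaic_shape(x_arcmin, y_arcmin, pixel_arcsec):
    """Return (nx, ny), the output image size in pixels for the requested area."""
    if pixel_arcsec <= 0:
        raise ValueError("pixel size must be positive")
    # Round before ceil so floating-point noise cannot add a spurious pixel.
    nx = math.ceil(round(x_arcmin * 60.0 / pixel_arcsec, 6))
    ny = math.ceil(round(y_arcmin * 60.0 / pixel_arcsec, 6))
    return nx, ny


def mosaic_bytes(nx, ny, bytes_per_pixel=4):
    """Approximate size of the pixel array (assumed 4 bytes per pixel)."""
    return nx * ny * bytes_per_pixel
```

For example, a 48 x 48 arcmin request (0.8 degree across) at 0.5 arcsec per pixel gives a 5760 x 5760 grid.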
The functionality described in this section is a V2.0 requirement. It is included here as there are similarities to the procedures detailed in the previous section.
After the user submits values for position and waveband, this form will dynamically action the creation of another form that allows the user to select which frames they wish to use in the construction of a stacked image.
Following the parsing of the input parameters (first 3 steps of Section 4.1) the Perl CGI script actioned here constructs and submits an SQL query that returns a list of images that cover the requested area. The script then generates a web form that is returned to the browser. This form will list all the images and allow the user to tick check-boxes alongside each one. An email address will also be required. When submitted this form in turn actions another Perl CGI script which receives the list of selected files and passes them to the Stacking Script. Again the intensive processing is done as a background task with an email notification being sent on completion. The stacking procedure is described in Section 5 of the WSA Data Flow Document (VDF-WFA-WSA-005).
The links to the data will be dynamic: direct access to the files will not be permitted, so each link will instead run a CGI script that copies the requested file to a publicly accessible area on the server and returns a link to the copy.
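One way to picture the copy-then-link step is the sketch below, written in Python rather than as the planned CGI script. The function name, the random per-request directory and the path check are all assumptions added for illustration; the document does not specify how the public area is organised.

```python
import secrets
import shutil
from pathlib import Path


def stage_for_download(relative_name, archive_root, public_root, base_url):
    """Copy an archive file into the public area and return a URL for the copy."""
    archive_root = Path(archive_root).resolve()
    source = (archive_root / relative_name).resolve()
    try:
        # Refuse names such as '../secret' that would escape the archive tree.
        source.relative_to(archive_root)
    except ValueError:
        raise PermissionError(f"{relative_name!r} is outside the archive")
    if not source.is_file():
        raise FileNotFoundError(relative_name)
    # A random directory name keeps staged copies from colliding.
    token = secrets.token_hex(8)
    dest_dir = Path(public_root) / token
    dest_dir.mkdir(parents=True)
    shutil.copy2(source, dest_dir / source.name)
    return f"{base_url.rstrip('/')}/{token}/{source.name}"
```

A real deployment would also need to clear out staged copies periodically.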
Public (world accessible) versions of these pages will show all archive contents but the links to unreleased data will be grayed out (not active) and marked as currently unavailable.
The same functionality will also be offered in the form of C-code that users can download, compile and run from the command line. The code will submit an HTTP request to a CGI script and retrieve the results. Again the underlying CGI script will be very similar to that of Section 4.1. Users will be able to create their own batch extractions by wrapping the supplied code up in a script. An example command line invocation would be:
geturl "ra=10:11:12.1&dec=+40:50:10&x=1&y=1&waveband=J&objcat=no" myfile.fits
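For users who prefer to script their requests, the query string in the example above can be assembled programmatically. The sketch below uses Python's standard library; the parameter names are taken from the example command, while the function name is illustrative and no endpoint URL is assumed because none is given here.

```python
from urllib.parse import urlencode


def build_query(ra, dec, x, y, waveband, objcat=False):
    """Build the query string for an extraction request, escaping special characters."""
    return urlencode({
        "ra": ra,
        "dec": dec,
        "x": x,
        "y": y,
        "waveband": waveband,
        "objcat": "yes" if objcat else "no",
    })
```

`urlencode` escapes the leading `+` of a positive declination as `%2B`. In form-encoded query strings a literal `+` is normally decoded as a space, so escaping it is a safe default; whether the WSA script requires this is not stated.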
The methods outlined in this section will only access data classed as world readable.
However the online SuperCOSMOS Sky Surveys (SSS) particularly
perform much the same functionality described in Sections 4.1 and 4.2 albeit without the SQL query stage.
Access to SSS data has been made available from within GAIA (under Data-Servers Browse Catalog Directories... and open/expand SuperCOSMOS catalogues) and Aladin (under load image servers others).
The geturl code and functionality described in Section 4.6 has been supplied to the 6dF observers who routinely use it to generate SSS finders for checking target objects (3000-10000 extractions per week).
Screenshots of some of the access methods described above are provided in the Appendix (Section 9).
The processes listed above are also shown in the form of a sequence diagram in Fig 3.
Note that the intensive part of any query is carried out by SQL server. Most of the differences in functionality between the access methods described below lie in the construction of the SQL query; the code for executing the query is largely generic and reusable.
The amount of data returned by a query and written to file will need to be restricted. Experience with the SSS showed 1,000,000 records (10 UK Schmidt fields) to be a sensible limit for that system.
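A row cap of this kind can be enforced while reading the result set, so that the user can be told when output was truncated. The following is a minimal sketch; the function name is illustrative, and the default of 1,000,000 is the SSS figure quoted above rather than a WSA requirement.

```python
from itertools import islice

# Placeholder: the SSS limit quoted above; a WSA value is not specified.
MAX_ROWS = 1_000_000

_END = object()  # sentinel marking an exhausted iterator


def take_limited(rows, limit=MAX_ROWS):
    """Return (kept_rows, truncated) after reading at most `limit` rows."""
    it = iter(rows)
    kept = list(islice(it, limit))
    # If one more row is available, the result was cut short.
    truncated = next(it, _END) is not _END
    return kept, truncated
```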
The implementation of V1.0 functionality is described below. Again the primary access method will be via web forms. UKIDSS data not yet classed as world readable will be password protected with the unrestricted forms only able to connect to the world readable database. The other access methods that will be offered are also discussed. Most of these methods will by default access the latest merged UKIDSS survey tables.
The actioned servlet will first convert the input coordinates, or resolve the input name (using SIMBAD or NED), to an RA and DEC in J2000 decimal degrees. An SQL query will then be formed around an SQL function (based on fGetNearbyObj from SDSS Skyserver) that efficiently searches through the table using the HTM index.
SELECT u.* (or subset, or inputted parameters) FROM UKIDSS_TABLE u, fGetNearbyObj(RA,DEC,radius) n WHERE u.objID=n.objID AND (inputted where clause) ORDER BY (inputted parameter(s))
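The query template above can be assembled from the form inputs roughly as follows. This is a sketch in Python rather than the planned Java servlet. The `?` placeholders for RA, DEC and radius, the identifier check and the function names are assumptions, and the optional where clause is passed through as free text exactly as the form describes.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name):
    """Allow only plain identifiers as table or column names."""
    if not _IDENT.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def build_radial_query(table, columns=(), where=None, order_by=()):
    """Return SQL text with three '?' placeholders for RA, DEC and radius."""
    select = ", ".join(f"u.{_check(c)}" for c in columns) or "u.*"
    sql = (f"SELECT {select} FROM {_check(table)} u, "
           "fGetNearbyObj(?, ?, ?) n WHERE u.objID = n.objID")
    if where:
        # Free-text clause typed by the user; assumes a read-only database login.
        sql += f" AND ({where})"
    if order_by:
        sql += " ORDER BY " + ", ".join(f"u.{_check(c)}" for c in order_by)
    return sql
```

Binding the position and radius as parameters, instead of pasting them into the string, keeps the numeric inputs out of the SQL text.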
The servlet will submit the query as a separate thread, with the main thread keeping the browser connection active and checking that it hasn't been stopped by the user.
On completion the query thread parses and formats the returned rows, writes the data to file if requested and prints any HTML table output and links to files to the browser window.
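The two-thread arrangement (a worker running the query while the main thread watches for the user stopping the request) can be sketched as below. Python's `threading` module is used here instead of Java servlet threads. The `query_fn` and `is_cancelled` callables are hypothetical stand-ins for the database call and the browser-connection check.

```python
import threading


def run_with_cancel(query_fn, is_cancelled, poll_interval=0.05):
    """Run query_fn in a worker thread; return its rows, or None if cancelled."""
    outcome = {}

    def worker():
        try:
            outcome["rows"] = query_fn()
        except Exception as exc:
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    while thread.is_alive():
        if is_cancelled():
            # The user stopped the request: stop waiting for the worker.
            return None
        thread.join(poll_interval)
    if "error" in outcome:
        raise outcome["error"]
    return outcome["rows"]
```

This sketch only stops waiting on cancellation; actually aborting the query on the database side would need driver support and is not shown.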
If requested, ellipse finder plots (GIF/JPG) will be produced for small area extractions (up to 10 arcmin) and output to the browser.
There will be two versions of this form. The first will use drop down menus and text boxes to guide the user in building an SQL query for execution by the servlet. Users will be able to choose/input values to construct the SELECT, FROM, WHERE and ORDER BY clauses of SQL. An option will enable the querying of a second table joined with the primary table. The secondary table can be one of the non-WFCAM based tables held in the archive e.g. SSS or SDSS. Joins will be made via the neighbours table for a given combination.
The second version is based around a text box into which users with knowledge of SQL and the contents of the WSA (which will be documented) can directly input their SQL query.
Output options for both versions will be the same as for OF1.
This multipart form offers the user the ability to upload a list of coordinates (parsed using the O'Reilly servlet classes) and match them against tables held in the database.
Users will specify the pairing radius and whether they want just the nearest object or all matching objects extracted.
Unlike the previous object catalogue forms, several SQL statements will be executed by the servlet. The first of these creates a temporary database table, which is then populated with the contents of the upload file. Finally the requested database table and the temporary table are paired. The temporary table is automatically dropped when the JDBC connection is closed. This method is much more efficient than issuing a separate query for each object whilst looping through the user's file.
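The create, populate and pair sequence can be illustrated with a small self-contained example. It uses Python with an in-memory SQLite database in place of the archive DBMS and JDBC. The table and column names (`source`, `upload`, `ra_deg`, `dec_deg`) are hypothetical, and the brute-force join has none of the spatial indexing the real archive would rely on.

```python
import math
import sqlite3


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec (haversine formula), inputs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def crossmatch(conn, upload, radius_arcsec, nearest_only=True):
    """Pair uploaded (ra, dec) rows against the hypothetical `source` table."""
    conn.create_function("sep", 4, angular_sep_arcsec)
    # Step 1: create the temporary table for the user's positions.
    conn.execute("CREATE TEMP TABLE upload "
                 "(id INTEGER PRIMARY KEY, ra_deg REAL, dec_deg REAL)")
    try:
        # Step 2: populate it from the upload file contents.
        conn.executemany("INSERT INTO upload (ra_deg, dec_deg) VALUES (?, ?)", upload)
        # Step 3: pair the two tables in a single query.
        rows = conn.execute(
            "SELECT up.id, s.objID, "
            "sep(up.ra_deg, up.dec_deg, s.ra_deg, s.dec_deg) AS d "
            "FROM upload up JOIN source s "
            "ON sep(up.ra_deg, up.dec_deg, s.ra_deg, s.dec_deg) <= ? "
            "ORDER BY up.id, d", (radius_arcsec,)).fetchall()
    finally:
        conn.execute("DROP TABLE upload")
    if nearest_only:
        nearest = {}
        for row in rows:
            # Rows are sorted by distance, so the first per upload id is nearest.
            nearest.setdefault(row[0], row)
        rows = list(nearest.values())
    return rows
```

The `nearest_only` flag corresponds to the user's choice between the nearest object and all matches within the pairing radius.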
As detailed in Section 4.5 the WSA pages will provide a browsable way to reach links to the object catalogues which are stored as FITS binary tables (generated from and associated with a given FITS image file by the CASU standard source extraction tool).
The functionality offered in Sections 5.1 and 5.2 will be made compatible with the GAIA and Aladin tools. Some of the options available on the web form will be hard-wired to sensible defaults for this implementation (e.g. the parameters returned) as the interface, especially with GAIA, is not fully configurable.
As described in Section 4.6 command line C-code will also be provided to the user allowing direct (non-browser) radial and rectangular queries.
Again the methods outlined in this section will only access data classed as world readable.
These forms access the PhotoPrimary table of the SDSS EDR held at WFAU on a basic PC. This table contains over 10 million records.
The radial form uses the CDS Sesame Java class and web service to resolve an input name to J2000 coordinates using Simbad.
Cross-matching an uploaded file of 1000 records takes some 40-90 seconds.
The initial access to the 6dF Galaxy Redshift Survey offers similar functionality to that described in Section 5.3 see
Screenshots of some of the examples described above are provided in the Appendix (Section 9).
VOTable format will be one of the output options for the V1.0 object catalogue access. In moving towards V2.0 specified functionality the pixel and object catalogue access will be offered as XML/SOAP driven web services that will be published on the Astrogrid registry. Tools and standards are starting to emerge and will be adopted as they are finalised.
Log files of user access will be archived and used to monitor and generate statistics on the WSA usage (hits, queries, data volume served).
6dF : Six-degree field
6dFGS : Six-degree field Galaxy Survey
ADnn : Applicable Document No nn
CGI : Common Gateway Interface
CASU : Cambridge Astronomical Survey Unit
DBD : Database Driver
DBI : Database Interface
DXS : Deep Extragalactic Survey
LWP : LibWWW-Perl
HTML : HyperText Markup Language
HTTP : Hypertext Transfer Protocol
SOAP : Simple Object Access Protocol
SQL : Structured Query Language
SSS : SuperCOSMOS Sky Surveys
UDS : Ultra Deep Survey
UKIDSS : UKIRT Infrared Deep Sky Survey
VISTA : Visible and Infrared Survey Telescope for Astronomy
VPO : VISTA Project Office
W3C : World-Wide Web Consortium
WFAU : Wide Field Astronomy Unit (Edinburgh)
XML : eXtensible Markup Language
|AD01||Science Requirements Analysis Document||VDF-WFA-WSA-002, Issue 1.0, 2/04/03|
|AD02||Database Design Document||VDF-WFA-WFCAM-007, Issue 1.0, 2/04/03|
|AD03||WSA Data Flow Document||VDF-WFA-WFCAM-005, Issue 1.0, 2/04/03|
|Issue||Date||Section(s) Affected||Description of Change/Change Request Reference/Remarks|
|WFAU:||P Williams, N Hambly|
|CASU:||M Irwin, J Lewis|
|UKIDSS:||S. Warren, A. Lawrence|
The command line arguments were:
latex2html -split 0 VDF-WFA-WSA-008-I1
The translation was initiated by Mike Read on 2003-04-02