

VISTA DATA FLOW SYSTEM
(VDFS)
---------------
for VISTA & WFCAM data

WSA User Interface Document

author : M.A. Read (WFAU Edinburgh), Project assistant
number : VDF-WFA-WSA-008
issue : Issue 1.0
date : 2 April 2003
co-authors : N.C. Hambly, I.A. Bond



SCOPE

This document describes the interface and underlying software that will sit between the data held in the WSA and the end user. In this context a user might be a member of UKIDSS, an astronomer outside the consortium or, to a lesser extent, a member of the public. However, this document is aimed at software engineers and astronomers with knowledge of database access and web services.

The Science Requirements Analysis Document (AD01), specifically sections 3.4 and 4, lays out the conditions for user access to the WSA. Here we mainly concentrate on how the V1.0 specified functionality for accessing survey data will be implemented.


INTRODUCTION

The WSA website will be the primary point of contact for users and Section 3 gives a summary of its construction and content.

Users of the WSA will mainly be interested in access to the stored pixel data (FITS images) and/or the generated object catalogues. The way the data are held in the archive and the different software required to process user requests lead us to consider these two types of access in separate sections.

Section 4 discusses the user interface to the pixel data, splitting it up into the different ways in which users will be able to extract image data and the underlying software required. The section also gives details of relevant experience in serving pixel data.

The user interface providing access to the object catalogue data is similarly broken down and is the subject of Section 5. Links to prototypes demonstrating some of the planned functionality are provided.

Conformance of the archive interface to emerging Virtual Observatory standards is briefly discussed in Section 6, and housekeeping issues in Section 7.

Section 8 lists the documentation that will be provided to assist the user.


WSA WEBSITE

Initial access to information on the WSA and to the data will be via the website maintained by the WFAU.


Construction

An Apache web server running under Linux and offering CGI will host the WSA website. Tomcat will also be installed to provide Java servlet functionality. All the site's pages and forms will be validated as HTML 4.01 compliant using the W3C validation service. This should ensure they work with a wide range of browsers on common operating systems. Specifically, they will be tested using Microsoft Internet Explorer 5 under Windows and Netscape 4.76 running on Unix/Linux. Cascading Style Sheets (CSS) will also be used to build the website, with this content again being W3C validated. Initially JavaScript will not be used, though this might be revisited if the functionality it offers (for example, in writing dynamic access forms) outweighs potential problems with browser compatibility.


Content summary

The WSA website will consist of


PIXEL DATA

As detailed in document VDF-WFA-WSA-007 the WSA will store the pixel data in multi-extension FITS files in a flat file system. Tables in the SQL database will hold the meta data extracted from each image and will also record the path/filename of the corresponding FITS file. The SQL database resides on a separate machine (Windows 2000 PC) that is networked to the web server. The FITS files are on disks also visible to the web server.

Different ways for accessing the pixel data are described below with a schematic representation given in Fig 1. The functionality offered by them will meet the specified requirements of V1.0, with Sections 4.3 and 4.4 covering some aspects of V2.0.

Figure 1: Schematic overview of pixel data access

The primary access route for users will be via web forms on the WSA website. Password-protected versions of the forms will restrict access to the data not yet classed as world readable, i.e. the SQL queries discussed below will be constructed so as to only return and/or pass on image data appropriate to the level of the user. The CGI scripts actioned by the forms will be written in Perl, as it offers a good way of wrapping up the steps required to perform the extractions. Speed and efficiency are more important in the low-level manipulation of the binary data, and this will be carried out using C/C++.


Small image extraction - pixel form 1 (PF1)

This form will enable users to extract an image from a SINGLE extension to a stored FITS file. The maximum size extracted will therefore be limited by the area covered by the extension accessed.

Boxes and menus on the form will allow the user to specify the celestial coordinates of the extraction (decimal degrees or sexagesimal), the coordinate system (J2000, B1950, Galactic, SDSS), the size of the area to extract (x,y in arcmin), the waveband, and type of image to use (unstacked, DXS, UDS, difference images) and whether or not to return a GIF/JPG image in addition to a link to the extracted FITS file. There will also be an option to attach a default object catalogue as a binary table extension to the FITS file.
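As an illustration of the coordinate handling implied by the form, the following minimal sketch converts a sexagesimal string to decimal degrees. It is written in Python purely for illustration (the CGI scripts themselves are planned in Perl), and the function name, the colon-separated input format and the range checks are assumptions rather than part of the design.

```python
def sexagesimal_to_degrees(text, is_ra):
    """Convert 'hh:mm:ss.s' (RA) or '+dd:mm:ss.s' (Dec) to decimal degrees.

    Hypothetical helper: the colon-separated format is an assumption.
    """
    text = text.strip()
    # Take the sign from the string so that '-00:30:00' stays negative.
    sign = -1.0 if text.startswith("-") else 1.0
    parts = text.lstrip("+-").split(":")
    if len(parts) != 3:
        raise ValueError("expected three colon-separated fields: %r" % text)
    first, minutes, seconds = (float(p) for p in parts)
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes/seconds out of range: %r" % text)
    value = first + minutes / 60.0 + seconds / 3600.0
    if is_ra:
        if sign < 0 or value >= 24.0:
            raise ValueError("RA must lie in [0h, 24h): %r" % text)
        return value * 15.0  # hours to degrees
    if value > 90.0:
        raise ValueError("Dec must lie in [-90, +90] degrees: %r" % text)
    return sign * value
```

For example, `sexagesimal_to_degrees("-00:30:00", is_ra=False)` gives -0.5. Conversion between coordinate systems (B1950, Galactic, SDSS) is a separate step not covered by this sketch.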

When submitted this form will action a Perl CGI script on the server. A summary of the tasks the script will perform is given below and the associated workflow is shown in Fig 2.

Figure 2: UML activity diagram showing the flow of events and control in the extraction of a small image by the user. This workflow involves processes carried out by the web browser client, the server-side CGI script, the database manager system, and the operating system. The roles of these players are depicted by separating out the activity diagram into the "swimlanes" as shown.

It is anticipated that small image extractions will be completed in real time (under 30 seconds). They will be run at higher priority than the background processes described later.


Batch small image extraction - pixel form 2 (PF2)

Uploading a file of coordinates to this multi-part form will provide the user with a batch mode front end to the functionality offered in Section 4.1. Users will be asked to supply the path of the upload file, the coordinate system used, the size of extraction, the waveband, whether they want catalogue data attached, and a valid email address. Users will also be able to request PostScript gray-scale plots of the extracted images.

A limit will need to be placed on the total amount of pixel data that can be extracted. Batch access to the SuperCOSMOS Sky Surveys (SSS) is restricted to a total of 1000 sq. arcmin.

The Perl CGI script actioned by this form will carry out similar steps to those described above in Section 4.1 as it loops through the input file. Output will be written to a logfile, and if a given image cannot be extracted the script will skip to the next one. After initial checking of the input parameters the actual extractions will be run in the background, with a message being returned to the browser informing the user that an email will be sent to them when the script has finished.
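The batch behaviour described above (a total-area check, then a loop that logs each line and skips failures) can be sketched as follows. This is an illustrative Python outline only, not the planned Perl script; the function names, the "ra dec" line format and the use of the SSS figure of 1000 sq. arcmin as the default limit are assumptions.

```python
def within_area_limit(n_positions, x_arcmin, y_arcmin, limit_sq_arcmin=1000.0):
    """True if the total requested area does not exceed the limit.

    The default limit is the SSS value quoted above, used as a placeholder.
    """
    return n_positions * x_arcmin * y_arcmin <= limit_sq_arcmin


def run_batch(lines, extract_one, log):
    """Loop over 'ra dec' lines, skipping any that cannot be processed.

    extract_one(ra, dec) is a hypothetical callable standing in for the
    single-image extraction of Section 4.1; log is a list of messages.
    """
    results = []
    for lineno, line in enumerate(lines, start=1):
        try:
            ra_text, dec_text = line.split()
            results.append(extract_one(float(ra_text), float(dec_text)))
            log.append("line %d: ok" % lineno)
        except Exception as exc:
            # Record the failure and carry on with the next position.
            log.append("line %d: skipped (%s)" % (lineno, exc))
    return results
```

Catching every exception inside the loop is a deliberate choice in this sketch, so that a single bad line never aborts the remainder of a user's batch.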

If requested, the PostScript finders will be generated prior to the individual FITS files being tarred up and gzipped. On completion an email will be sent to the user giving links to their results.


Large image extraction - pixel form 3 (PF3)

This form will enable users to construct an image from multiple CCD frames covering an area of sky up to 0.8 degree across. However the underlying mosaicing software will be able to generate arbitrarily large areas (a V2.0 requirement).

The inputted parameters will be the same as PF1 (Section 4.1) but with the addition of a text box to set the pixel size (arcsec) of the returned image and a box to enter the user's email address.

This form will again action a CGI script. The main difference in the processing steps outlined in Section 4.1 is that the initial SQL query constructed and sent to the image database table will return the path/filenames of all images held that overlap with the area of sky requested. This list of files together with the size and resolution requested for the output image will in turn be passed to a Perl mosaicing script (see below) that will mosaic the files together into a single image. As a large amount of pixel re-sampling can be involved the main work will be done in the background with the user being notified by email on completion.
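The overlap query described above might look something like the following sketch. It uses Python's built-in sqlite3 module as a stand-in for the archive database, and the table name `image` and its bounding-box columns (`ra_min`, `ra_max`, `dec_min`, `dec_max`) are hypothetical; the real image metadata tables are defined in the Database Design Document. RA wrap-around at 0/360 degrees is ignored here.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory stand-in for the image metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image (filename TEXT, ra_min REAL, ra_max REAL, "
        "dec_min REAL, dec_max REAL)"
    )
    conn.executemany(
        "INSERT INTO image VALUES (?, ?, ?, ?, ?)",
        [
            ("a.fits", 10.0, 10.2, 0.0, 0.2),
            ("b.fits", 10.2, 10.4, 0.0, 0.2),
            ("c.fits", 50.0, 50.2, 0.0, 0.2),
        ],
    )
    return conn


def overlapping_images(conn, ra_min, ra_max, dec_min, dec_max):
    """Return filenames whose bounding box intersects the requested area."""
    sql = (
        "SELECT filename FROM image "
        "WHERE ra_min <= ? AND ra_max >= ? "
        "AND dec_min <= ? AND dec_max >= ? "
        "ORDER BY filename"
    )
    # Two boxes overlap when each starts before the other one ends.
    rows = conn.execute(sql, (ra_max, ra_min, dec_max, dec_min))
    return [row[0] for row in rows]
```

The returned list of filenames is what would be handed on to the mosaicing step.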

Mosaicing Script - each image passed to the script will be re-mapped to another FITS image whose WCS system is defined by the pixel size and area of sky requested. Once all images have been processed they will then be merged down into a single FITS image. See Section 5 of the WSA Data Flow Document (VDF-WFA-WSA-005) for more detail.
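Since the output WCS is defined by the requested pixel size and area, the dimensions of the output image follow from simple arithmetic. The short Python sketch below shows that calculation only (no re-sampling); the function name and the choice to round up to a whole pixel are assumptions.

```python
import math


def output_grid(width_arcmin, height_arcmin, pixel_arcsec):
    """Return (nx, ny), the output image size in pixels.

    Rounds up so that the requested area is fully covered.
    """
    if pixel_arcsec <= 0 or width_arcmin <= 0 or height_arcmin <= 0:
        raise ValueError("sizes must be positive")

    def n_pixels(extent_arcmin):
        # Small tolerance guards against floating-point noise before ceil.
        return int(math.ceil(extent_arcmin * 60.0 / pixel_arcsec - 1e-9))

    return n_pixels(width_arcmin), n_pixels(height_arcmin)
```

For instance, a 0.8 degree (48 arcmin) square at an arbitrarily chosen 0.5 arcsec per pixel would give a 5760 x 5760 pixel image, which indicates why the re-sampling is run as a background task.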


Stacked image generation - pixel form 4 (PF4)

The functionality described in this section is a V2.0 requirement. It is included here as there are similarities to the procedures detailed in the previous section.

After submitting user inputted values for position and waveband this form will dynamically action the creation of another form that allows the user to select which frames they wish to use in the construction of a stacked image.

Following the parsing of the input parameters (first 3 steps of Section 4.1) the Perl CGI script actioned here constructs and submits an SQL query that returns a list of images that cover the requested area. The script then generates a web form that is returned to the browser. This form will list all the images and allow the user to tick check-boxes alongside each one. An email address will also be required. When submitted this form in turn actions another Perl CGI script which receives the list of selected files and passes them to the Stacking Script. Again the intensive processing is done as a background task with an email notification being sent on completion. The stacking procedure is described in Section 5 of the WSA Data Flow Document (VDF-WFA-WSA-005).


Browsable access to pixel data

The WSA website will have tables and charts displaying the contents of the archive. These pages will be generated automatically by periodically running Perl scripts that perform SQL queries on the various image database tables. The pages will be split up into the various surveys and further sub-divided on sky position if the tables get too large. The tables themselves will contain links to the stored image FITS files, the associated catalogue FITS files and the compressed atlas images (JPGs).

The links to the data will be dynamic in that direct access to the files will not be permitted so the link will run a CGI script that copies the requested file to a publicly accessible area on the server and returns that link.
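A minimal sketch of this copy-and-link step is given below, in Python for illustration only. The function name, the use of a random prefix to avoid name clashes, and the arguments are all assumptions; any check of the user's access rights would have to happen before this point and is not shown.

```python
import os
import shutil
import uuid


def stage_file(archive_path, public_dir, base_url):
    """Copy an archive file to the public area and return its URL.

    A random prefix keeps concurrent requests for the same file apart.
    """
    name = uuid.uuid4().hex + "_" + os.path.basename(archive_path)
    shutil.copy(archive_path, os.path.join(public_dir, name))
    return base_url.rstrip("/") + "/" + name
```

Files staged in this way are temporary copies, so they fall under the clean-up policy described in the housekeeping section.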

Public (world accessible) versions of these pages will show all archive contents but the links to unreleased data will be grayed out (not active) and marked as currently unavailable.


Other access to pixel data

The functionality offered in Section 4.1 will be made compatible with the GAIA (based on SkyCAT) and Aladin tools. The underlying Perl CGI script will be very similar (or indeed possibly the same but with different options being executed). The main difference is that the generated FITS image will be piped directly back to the querying tool.

The same functionality will also be offered in the form of C-code that users can download, compile and run from the command line. The code will submit an HTTP request to a CGI script and retrieve the results. Again the underlying CGI script will be very similar to that of Section 4.1. Users will be able to create their own batch extractions by wrapping the supplied code up in a script. An example command line invocation would be:

geturl "ra=10:11:12.1&dec=+40:50:10&x=1&y=1&waveband=J&objcat=no" > myfile.fits
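Users wrapping the supplied code in their own scripts will need to assemble the parameter string for each position. The Python sketch below builds one using the parameter names from the example above; the function name is hypothetical, and whether the server-side script expects the percent-encoded form produced here (rather than the raw form shown in the example) is an assumption. Note that in a standard URL query string an unencoded "+" is decoded as a space, which matters for positive declinations.

```python
from urllib.parse import urlencode


def build_extraction_query(ra, dec, x, y, waveband, objcat=False):
    """Return a URL-encoded query string for one image extraction."""
    params = [
        ("ra", ra),
        ("dec", dec),
        ("x", x),
        ("y", y),
        ("waveband", waveband),
        ("objcat", "yes" if objcat else "no"),
    ]
    # urlencode percent-encodes ':' and '+' so they survive decoding.
    return urlencode(params)
```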

The methods outlined in this section will only access data classed as world readable.


Demonstrations of pixel access

Currently there is no demonstration available that makes use of mocked up SQL tables of existing FITS images.

However, the online SuperCOSMOS Sky Surveys (SSS) forms, particularly

http://www-wfau.roe.ac.uk/sss/pixel.html

and

http://www-wfau.roe.ac.uk/sss/batchfile.html

provide much the same functionality as described in Sections 4.1 and 4.2, albeit without the SQL query stage.

Access to SSS data has been made available from within GAIA (under Data-Servers > Browse Catalog Directories... and open/expand SuperCOSMOS catalogues) and Aladin (under load > image servers > others).

The geturl code and functionality described in Section 4.6 has been supplied to the 6dF observers who routinely use it to generate SSS finders for checking target objects (3000-10000 extractions per week).

Screenshots of some of the access methods described above are provided in the Appendix (Section 9).


OBJECT CATALOGUE DATA

Object catalogue data are stored in SQL server database tables on a Windows machine networked to the web server. The basic recipe for access is described below:

The processes listed above are also shown in the form of a sequence diagram in Fig 3.

Figure 3: Sequence diagram representation of user access to object catalogue data.

Note that the intensive part of any query is carried out by SQL server. Most of the differences in functionality between the access methods described below lie in the construction of the SQL query; the code for executing the query is largely generic and re-usable.

The amount of data returned by a query and written to file will need to be restricted. Experience with the SSS showed 1,000,000 records (10 UK Schmidt fields) to be a sensible limit for that system.

The implementation of V1.0 functionality is described below. Again the primary access method will be via web forms. UKIDSS data not yet classed as world readable will be password protected with the unrestricted forms only able to connect to the world readable database. The other access methods that will be offered are also discussed. Most of these methods will by default access the latest merged UKIDSS survey tables.


Radial Search - Object Form 1 (OF1)

This form will enable a user to extract objects within a specified distance of a supplied position (RA/DEC, Galactic, SDSS lambda/eta) or object name. Other options passed from the form to the servlet will be: which UKIDSS survey to search (or all), which parameters to extract (all, subset or user specified), any additional constraints (used to form SQL WHERE clause), an option to sort the output on a supplied parameter(s), the format of the output data (HTML, delimited ascii file, binary FITS table, VOTable).

The actioned servlet will first convert the inputted coordinates or resolve the inputted name (using SIMBAD or NED) to an RA and DEC in J2000 decimal degrees. An SQL query will then be formed around an SQL function (based on fGetNearbyObj from SDSS Skyserver) that efficiently searches through the table using the HTM index.

e.g.

SELECT u.* (or subset, or inputted parameters)
FROM UKIDSS_TABLE u, fGetNearbyObj(RA,DEC,radius) n
WHERE u.objID=n.objID AND (inputted where clause)
ORDER BY (inputted parameter(s))
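To make clear what the radial search returns, the following Python sketch performs the same selection by brute force, computing the great-circle separation to every object. It is not how the SQL function works (the point of the HTM index is to avoid testing every row); the function names and the tuple layout of the object list are assumptions made for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def radial_search(objects, ra, dec, radius_arcmin):
    """Return objIDs within radius_arcmin of (ra, dec), nearest first.

    objects is an iterable of (objID, ra, dec) in J2000 decimal degrees.
    """
    radius_deg = radius_arcmin / 60.0
    found = []
    for obj_id, obj_ra, obj_dec in objects:
        sep = angular_separation_deg(ra, dec, obj_ra, obj_dec)
        if sep <= radius_deg:
            found.append((sep, obj_id))
    return [obj_id for _, obj_id in sorted(found)]
```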

The servlet will submit the query as a separate thread, with the main thread keeping the browser connection active and checking that it hasn't been stopped by the user.
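The two-thread arrangement can be outlined as below. This is a Python sketch of the pattern rather than servlet code; `run_query`, `send_keepalive` and `client_connected` are hypothetical callables standing in for the database call, the write to the browser, and the check on the browser connection.

```python
import threading


def run_query_with_keepalive(run_query, send_keepalive, client_connected,
                             poll_seconds=1.0):
    """Run run_query() in a worker thread while the caller stays responsive.

    Returns the query result, or None if the client went away first.
    """
    outcome = {}

    def worker():
        try:
            outcome["rows"] = run_query()
        except Exception as exc:  # hand the error back to the main thread
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    while thread.is_alive():
        if not client_connected():
            # The user has stopped the request; a real implementation
            # would also cancel the running database query here.
            return None
        send_keepalive()
        thread.join(poll_seconds)
    if "error" in outcome:
        raise outcome["error"]
    return outcome["rows"]
```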

On completion the query thread parses and formats the returned rows, writes the data to file if requested and prints any HTML table output and links to files to the browser window.

If requested, ellipse finder plots (GIF/JPG) will be produced for small area extractions (up to 10 arcmin) and output to the browser.


Rectangular Search - Object Form 2 (OF2)

This is similar to the radial search, but this time an SQL function is used that extracts objects bounded by limits in RA and DEC. If the user specifies a coordinate system other than RA and DEC (J2000), the rectangle extracted is such that it encloses the one requested in the inputted coordinate system. Extra objects returned by the query can be removed by the servlet, though doing this at the SQL stage (e.g. by writing SQL functions to perform coordinate transforms) will also be investigated.


SQL query - Object Form 3 (OF3)

There will be two versions of this form. The first will use drop down menus and text boxes to guide the user in building an SQL query for execution by the servlet. Users will be able to choose/input values to construct the SELECT, FROM, WHERE and ORDER BY clauses of SQL. An option will enable the querying of a second table joined with the primary table. The secondary table can be one of the non-WFCAM based tables held in the archive e.g. SSS or SDSS. Joins will be made via the neighbours table for a given combination.

The second version is based around a text box into which users with knowledge of SQL and the contents of the WSA (which will be documented) can directly input their SQL query.

Output options for both versions will be as OF1.
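For the guided version of the form, the servlet has to assemble the four clauses from the submitted menu and text-box values. The Python sketch below shows one way this could be done; it is not the planned Java servlet code. The function name, the restriction of names to simple identifiers, the small set of comparison operators and the use of "?" placeholders for values are all assumptions, chosen here to keep user input out of the SQL text.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_OPERATORS = {"=", "<", ">", "<=", ">=", "<>"}


def assemble_select(table, columns, constraints=(), order_by=()):
    """Build (sql, params) from form selections.

    constraints is a sequence of (column, operator, value) triples.
    """
    names = [table, *columns, *order_by, *(c[0] for c in constraints)]
    for name in names:
        if not _IDENTIFIER.match(name):
            raise ValueError("illegal name: %r" % name)
    sql = "SELECT %s FROM %s" % (", ".join(columns) or "*", table)
    params = []
    if constraints:
        clauses = []
        for column, operator, value in constraints:
            if operator not in _OPERATORS:
                raise ValueError("illegal operator: %r" % operator)
            clauses.append("%s %s ?" % (column, operator))
            params.append(value)
        sql += " WHERE " + " AND ".join(clauses)
    if order_by:
        sql += " ORDER BY " + ", ".join(order_by)
    return sql, params
```

The free-text version of the form cannot be protected in this way, so for that version the database permissions of the account used by the servlet matter more.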


Cross-matching a catalogue - Object Form 4 (OF4)

This multipart form offers the user the ability to upload a list of coordinates (parsed using the O'Reilly servlet classes) and match them against tables held in the database.

Users will specify the pairing radius and whether they want just the nearest object or all matching objects extracted.

Unlike the previous object catalogue forms, several SQL statements will be executed by the servlet. The first of these creates a temporary database table; that table is then populated with the contents of the upload file. Finally the requested database table and temporary table are paired. The temporary table is automatically dropped when the JDBC connection is closed. This method is a lot more efficient than issuing a query for each object whilst looping through the user's file.
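The create, populate and pair sequence can be illustrated with the following Python sketch, which uses the built-in sqlite3 module in place of the archive database and JDBC. The table names `upload` and `catalogue`, their columns, and the split between a coarse declination-band join in SQL and an exact separation test in Python are assumptions made to keep the example short; the real pairing would be done inside the database.

```python
import math
import sqlite3


def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def crossmatch(conn, positions, radius_arcsec, nearest_only=True):
    """Pair uploaded (ra, dec) positions with rows of table 'catalogue'.

    Returns {upload row index: [objID, ...]}, nearest match first.
    """
    radius = radius_arcsec / 3600.0
    conn.execute(
        "CREATE TEMP TABLE upload (id INTEGER PRIMARY KEY, ra REAL, dec REAL)"
    )
    try:
        conn.executemany(
            "INSERT INTO upload VALUES (?, ?, ?)",
            [(i, ra, dec) for i, (ra, dec) in enumerate(positions)],
        )
        # One join for the whole upload instead of one query per object.
        rows = conn.execute(
            "SELECT u.id, u.ra, u.dec, c.objID, c.ra, c.dec "
            "FROM upload u JOIN catalogue c "
            "ON c.dec BETWEEN u.dec - ? AND u.dec + ?",
            (radius, radius),
        ).fetchall()
    finally:
        conn.execute("DROP TABLE upload")
    matches = {}
    for uid, ura, udec, obj_id, cra, cdec in rows:
        sep = separation_deg(ura, udec, cra, cdec)
        if sep <= radius:
            matches.setdefault(uid, []).append((sep, obj_id))
    result = {}
    for uid, found in matches.items():
        found.sort()
        ids = [obj_id for _, obj_id in found]
        result[uid] = ids[:1] if nearest_only else ids
    return result
```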


Browsable access to catalogue data

As detailed in Section 4.5 the WSA pages will provide a browsable way to reach links to the object catalogues which are stored as FITS binary tables (generated from and associated with a given FITS image file by the CASU standard source extraction tool).


Other access to catalogue data

The functionality offered in Sections 5.1 and 5.2 will be made compatible with the GAIA and Aladin tools. Some of the options available on the web form will be hard-wired to sensible defaults for this implementation (e.g. the parameters returned) as the interface, especially with GAIA, is not fully configurable.

As described in Section 4.6 command line C-code will also be provided to the user allowing direct (non-browser) radial and rectangular queries.

Again the methods outlined in this section will only access data classed as world readable.


Demonstrations of catalogue access

The core functionality described in Sections 5.1 and 5.4 is demonstrated in the prototypes at

http://www-wfau.roe.ac.uk/6dfgs/wsa/radial.html

and

http://www-wfau.roe.ac.uk/6dfgs/wsa/xmatch.html

respectively.

These forms access the PhotoPrimary table of the SDSS EDR held at WFAU on a basic PC. This table contains over 10 million records.

The radial form uses the CDS Sesame Java class and web service to resolve an input name to J2000 coordinates using Simbad.

Xmatching an uploaded file of 1000 records takes some 40-90 seconds.

The initial access to the 6dF Galaxy Redshift Survey offers similar functionality to that described in Section 5.3; see

http://www-wfau.roe.ac.uk/6dFGS/form.html

and

http://www-wfau.roe.ac.uk/6dFGS/SQL.html

Screenshots of some of the examples described above are provided in the Appendix (Section 9).


VIRTUAL OBSERVATORY CONSIDERATIONS

VOTable format will be one of the output options for the V1.0 object catalogue access. In moving towards V2.0 specified functionality the pixel and object catalogue access will be offered as XML/SOAP driven web services that will be published on the Astrogrid registry. Tools and standards are starting to emerge and will be adopted as they are finalised.
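As a rough illustration of the VOTable output option, the Python sketch below serialises query rows into the basic VOTABLE / RESOURCE / TABLE / FIELD / DATA / TABLEDATA element structure. It is a simplified outline under stated assumptions: the function name is hypothetical, and version, namespace and descriptive metadata (units, UCDs) that a conforming document would carry are omitted.

```python
import xml.etree.ElementTree as ET


def rows_to_votable(fields, rows):
    """Return a minimal VOTable-style XML string.

    fields is a list of (name, datatype) pairs; rows is a list of tuples.
    """
    votable = ET.Element("VOTABLE")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE")
    for name, datatype in fields:
        ET.SubElement(table, "FIELD", name=name, datatype=datatype)
    data = ET.SubElement(table, "DATA")
    tabledata = ET.SubElement(data, "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")
```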


HOUSEKEEPING

Temporary output files generated by user requests will be written to a publicly (HTTP) accessible area of the file system. A cron job will run daily and delete any of these files more than 48 hours old.
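The daily clean-up could be as simple as the following Python sketch, which removes regular files whose modification time is more than 48 hours old. The function name and the decision to leave sub-directories and symbolic links untouched are assumptions; the actual cron job may equally be a one-line shell command.

```python
import os
import time


def purge_old_files(directory, max_age_hours=48, now=None):
    """Delete regular files older than max_age_hours; return their names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600.0
    removed = []
    for entry in os.scandir(directory):
        # Only plain files are considered; directories and links are kept.
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```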

Log files of user access will be archived and used to monitor and generate statistics on the WSA usage (hits, queries, data volume served).


USER DOCUMENTATION

The WSA website will have extensive online documentation to help the user including:


APPENDIX

This section shows screenshots of some of the ways users can access the SSS and 6dFGS data held at WFAU. Pictures of a web form demonstrating catalogue matching with an SQL database (SDSS EDR held at WFAU) are also provided. See Sections 4.7 and 5.7 for more details.




Figure 4: Screenshot of the SSS pixel web form.

Figure 5: Screenshot of results from the SSS pixel web form.

Figure 6: Screenshot of GAIA accessing the SSS.

Figure 7: Screenshot of a 6dFGS web form. The query constructed here is "find objects that appear in the 2MASS input catalogue that have been observed and that have a J magnitude < 12, a J-K colour > 1.5 and where the quality of the measured redshift is >= 3. Sort output such that brightest K mag appears first. Parameters to be returned are 2MASS RA and DEC, J, K, J-K colour and the observed redshift (z) and quality".

Figure 8: Screenshot of results from submitted 6dFGS form shown in Figure 7.

Figure 9: Screenshot of WSA demo of catalogue matching web form.

Figure 10: Screenshot of catalogue matching results.


ACRONYMS & ABBREVIATIONS

6dF : Six-degree field
6dFGS : Six-degree field Galaxy Survey
ADnn : Applicable Document No nn
CGI : Common Gateway Interface
CASU : Cambridge Astronomical Survey Unit
DBD : Database Driver
DBI : Database Interface
DXS : Deep Extragalactic Survey
LWP : LibWWW-Perl
HTML : HyperText Markup Language
HTTP : Hypertext Transfer Protocol
SOAP : Simple Object Access Protocol
SQL : Structured Query Language
SSS : SuperCOSMOS Sky Surveys
UDS : Ultra Deep Survey
UKIDSS : UKIRT Infrared Deep Sky Survey
VISTA : Visible and Infrared Survey Telescope for Astronomy
VPO : VISTA Project Office
W3C : World-Wide Web Consortium
WFAU : Wide Field Astronomy Unit (Edinburgh)
XML : eXtensible Markup Language


APPLICABLE DOCUMENTS


AD01 : Science Requirements Analysis Document, VDF-WFA-WSA-002, Issue 1.0, 2/04/03
AD02 : Database Design Document, VDF-WFA-WFCAM-007, Issue 1.0, 2/04/03
AD03 : WSA Data Flow Document, VDF-WFA-WFCAM-005, Issue 1.0, 2/04/03



CHANGE RECORD


Issue | Date     | Section(s) Affected | Description of Change/Change Request Reference/Remarks
1.0   | 02/04/03 | All                 | New document


NOTIFICATION LIST

The following people should be notified by email whenever a new version of this document has been issued:


WFAU: P Williams, N Hambly
CASU: M Irwin, J Lewis
QMUL: J Emerson
ATC: M. Stewart
JAC: A. Adamson
UKIDSS: S. Warren, A. Lawrence
