fits: FITS TableBuilders are now self-documenting
FitsTableBuilder and ColFitsTableBuilder now extend DocumentedTableBuilder.
mbtaylor committed Jan 10, 2021
1 parent d3bd32c commit 80fffad
Showing 5 changed files with 144 additions and 14 deletions.
6 changes: 6 additions & 0 deletions fits/build.xml
@@ -403,6 +403,12 @@
<exclude name="**/README*"/>
</javac>

+<!-- Copy extra files that should live with packages classes
+! (i.e. are discovered using "getResource()"). -->
+<copy todir="${build.classes}">
+<fileset dir="${src.dir}/resources"/>
+</copy>
+
</target>

<!--
21 changes: 15 additions & 6 deletions fits/src/main/uk/ac/starlink/fits/ColFitsTableBuilder.java
@@ -9,9 +9,9 @@
import nom.tam.util.ArrayDataInput;
import uk.ac.starlink.table.StarTable;
import uk.ac.starlink.table.StoragePolicy;
-import uk.ac.starlink.table.TableBuilder;
import uk.ac.starlink.table.TableFormatException;
import uk.ac.starlink.table.TableSink;
+import uk.ac.starlink.table.formats.DocumentedTableBuilder;
import uk.ac.starlink.util.DataSource;

/**
@@ -30,7 +30,7 @@
* @author Mark Taylor
* @since 26 Jun 2006
*/
-public class ColFitsTableBuilder implements TableBuilder {
+public class ColFitsTableBuilder extends DocumentedTableBuilder {

private final WideFits wide_;

@@ -48,17 +48,14 @@ public ColFitsTableBuilder() {
* use null to avoid use of extended columns
*/
public ColFitsTableBuilder( WideFits wide ) {
+super( new String[] { "colfits" } );
wide_ = wide;
}

public String getFormatName() {
return "colfits-basic";
}

-public boolean looksLikeFile( String location ) {
-return location.toLowerCase().endsWith( ".colfits" );
-}
-
public void streamStarTable( InputStream in, TableSink sink, String pos )
throws TableFormatException {
throw new TableFormatException( "Can't stream from colFITS format" );
@@ -97,4 +94,16 @@ public StarTable makeStarTable( DataSource datsrc, boolean wantRandom,

return new ColFitsStarTable( datsrc, hdr, pos, false, wide_ );
}
+
+public boolean canStream() {
+return false;
+}
+
+public boolean docIncludesExample() {
+return false;
+}
+
+public String getXmlDescription() {
+return readText( "ColFitsTableBuilder.xml" );
+}
}
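
The removal of `looksLikeFile` together with the new `super( new String[] { ... } )` call suggests that the superclass now derives filename matching from the list of extensions. The following is a minimal Java sketch of that idea only. `ExtensionAwareBuilder`, `SketchFitsBuilder` and `SketchColFitsBuilder` are hypothetical stand-ins invented for illustration, not STIL classes, and the real `DocumentedTableBuilder` may well behave differently.

```java
import java.util.Locale;

/** Hypothetical base class that is told its file extensions once, at construction. */
abstract class ExtensionAwareBuilder {
    private final String[] extensions_;

    protected ExtensionAwareBuilder(String[] extensions) {
        // Defensive copy so callers cannot change the list afterwards.
        extensions_ = extensions.clone();
    }

    /** True if the location ends with "." plus one of the extensions, ignoring case. */
    public boolean looksLikeFile(String location) {
        String loc = location.toLowerCase(Locale.ROOT);
        for (String ext : extensions_) {
            if (loc.endsWith("." + ext.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }
}

/** Hypothetical subclass mirroring the extension list passed by FitsTableBuilder. */
class SketchFitsBuilder extends ExtensionAwareBuilder {
    SketchFitsBuilder() {
        super(new String[] {"fit", "fits"});
    }
}

/** Hypothetical subclass mirroring the extension list passed by ColFitsTableBuilder. */
class SketchColFitsBuilder extends ExtensionAwareBuilder {
    SketchColFitsBuilder() {
        super(new String[] {"colfits"});
    }
}

class ExtensionSketch {
    public static void main(String[] args) {
        System.out.println(new SketchFitsBuilder().looksLikeFile("spec23.FITS"));
        System.out.println(new SketchFitsBuilder().looksLikeFile("cat.colfits"));
        System.out.println(new SketchColFitsBuilder().looksLikeFile("cat.colfits"));
    }
}
```

With this arrangement each subclass only declares its extensions, which matches the shape of the per-class methods deleted in this commit.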
24 changes: 16 additions & 8 deletions fits/src/main/uk/ac/starlink/fits/FitsTableBuilder.java
@@ -20,11 +20,11 @@
import uk.ac.starlink.table.QueueTableSequence;
import uk.ac.starlink.table.StarTable;
import uk.ac.starlink.table.StoragePolicy;
-import uk.ac.starlink.table.TableBuilder;
import uk.ac.starlink.table.TableFormatException;
import uk.ac.starlink.table.TableSink;
import uk.ac.starlink.table.TableSequence;
import uk.ac.starlink.table.Tables;
+import uk.ac.starlink.table.formats.DocumentedTableBuilder;
import uk.ac.starlink.util.Compression;
import uk.ac.starlink.util.DataSource;
import uk.ac.starlink.util.IOUtils;
@@ -51,7 +51,8 @@
*
* @author Mark Taylor (Starlink)
*/
-public class FitsTableBuilder implements TableBuilder, MultiTableBuilder {
+public class FitsTableBuilder extends DocumentedTableBuilder
+implements MultiTableBuilder {

private static final Logger logger =
Logger.getLogger( "uk.ac.starlink.fits" );
@@ -72,6 +73,7 @@ public FitsTableBuilder() {
* use null to avoid use of extended columns
*/
public FitsTableBuilder( WideFits wide ) {
+super( new String[] { "fit", "fits" } );
wide_ = wide;
}

@@ -82,12 +84,6 @@ public String getFormatName() {
return "FITS";
}

-public boolean looksLikeFile( String location ) {
-String loc = location.toLowerCase();
-return loc.endsWith( ".fit" )
-|| loc.endsWith( ".fits" );
-}
-
/**
* Creates a StarTable from a DataSource which refers to a FITS
* file or stream. If the source has a position attribute, it
@@ -285,6 +281,18 @@ public void streamStarTable( InputStream istrm, TableSink sink,
}
}

+public boolean canStream() {
+return true;
+}
+
+public boolean docIncludesExample() {
+return false;
+}
+
+public String getXmlDescription() {
+return readText( "FitsTableBuilder.xml" );
+}
+
/**
* Attempts to convert the HDU starting at the current position in
* an input stream into a table, writing it to a given sink.
29 changes: 29 additions & 0 deletions fits/src/resources/uk/ac/starlink/fits/ColFitsTableBuilder.xml
@@ -0,0 +1,29 @@
<p>As well as normal binary and ASCII FITS tables, STIL supports
FITS files which contain tabular data stored in column-oriented format.
This means that the table is stored in a BINTABLE extension HDU,
but that BINTABLE has a single row, with each cell of that row
holding a whole column's worth of data. The final (slowest-varying)
dimension of each of these cells (declared via the <code>TDIMn</code> headers)
is the same for every column, namely,
the number of rows in the table that is represented.
Because all the cells of each column are stored contiguously,
access patterns that touch only a small proportion of the columns
can be much more efficient for very large, and especially very wide, tables,
since they require less I/O overhead in reading data blocks.
</p>

<p>Such tables are perfectly legal FITS files,
but general-purpose FITS software
may not recognise them as multi-row tables in the usual way.
This format is mostly intended for the case where you have a large
table in some other format (possibly the result of an SQL query)
and you wish to cache it in a way which can be read efficiently
by a STIL-based application.
</p>

<p>For performance reasons, it is advisable to access colfits files
uncompressed on disk. Reading them from a remote URL, or in gzipped form,
may be rather slow (in earlier versions it was not supported at all).
</p>
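
The I/O argument in ColFitsTableBuilder.xml can be made concrete by comparing where a given cell sits in each layout. The Java sketch below is illustrative only: `LayoutSketch` and its methods are hypothetical names, the column widths are made up, and offsets are taken relative to the start of the table data, ignoring FITS headers and 2880-byte block padding.

```java
/** Hypothetical helper comparing cell offsets in row-oriented and column-oriented storage. */
class LayoutSketch {

    /** Byte offset of cell (row, col) when whole rows are stored one after another. */
    static long rowOrientedOffset(int[] colWidths, long row, int col) {
        long rowLength = 0;
        for (int w : colWidths) {
            rowLength += w;
        }
        long colStart = 0;
        for (int i = 0; i < col; i++) {
            colStart += colWidths[i];
        }
        return row * rowLength + colStart;
    }

    /** Byte offset of cell (row, col) when each column's cells are stored contiguously. */
    static long columnOrientedOffset(int[] colWidths, long nrow, long row, int col) {
        long colStart = 0;
        for (int i = 0; i < col; i++) {
            colStart += colWidths[i] * nrow;
        }
        return colStart + row * colWidths[col];
    }

    public static void main(String[] args) {
        int[] widths = {8, 4, 8};   // e.g. a double, an int and a double per row
        long nrow = 1000;
        for (long row = 0; row < 3; row++) {
            System.out.println("row " + row
                + ": row-oriented offset " + rowOrientedOffset(widths, row, 1)
                + ", column-oriented offset "
                + columnOrientedOffset(widths, nrow, row, 1));
        }
    }
}
```

For the middle column in this example, consecutive cells are 20 bytes apart in the row-oriented layout but only 4 bytes apart in the column-oriented one, so a scan of that single column reads a compact region rather than striding through every row.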

78 changes: 78 additions & 0 deletions fits/src/resources/uk/ac/starlink/fits/FitsTableBuilder.xml
@@ -0,0 +1,78 @@
<p>FITS is a very well-established format for storage of
astronomical table or image data
(see <a href="https://fits.gsfc.nasa.gov/">https://fits.gsfc.nasa.gov/</a>).
This reader can read tables stored in
binary (<code>XTENSION='BINTABLE'</code>) and
ASCII (<code>XTENSION='TABLE'</code>) table extensions;
any image data is ignored.
Currently, binary table extensions are read much more efficiently
than ASCII ones.
</p>

<p>When a table is stored in a BINTABLE extension in an uncompressed
FITS file on disk, the table is 'mapped' into memory;
this generally means very fast loading and low memory usage.
FITS tables are thus usually efficient to use.
</p>

<p>Limited support is provided for the semi-standard
<a href="https://healpix.sourceforge.io/data/examples/healpix_fits_specs.pdf"
>HEALPix-FITS</a> convention;
information about the HEALPix level and coordinate system is read
and made available for application usage and user examination.
</p>

<p>A private convention is used to support encoding of tables with
more than 999 columns (not possible in standard FITS);
this was discussed on the FITSBITS mailing list in July 2017
in the thread
<a href="https://listmgr.nrao.edu/pipermail/fitsbits/2017-July/002967.html"
>BINTABLE convention for >999 columns</a>.
</p>

<p>Header cards in the table's HDU header will be
made available as table parameters.
Only header cards which are not used to specify the table format itself
are visible as parameters (e.g. NAXIS and TTYPE* cards are not).
HISTORY and COMMENT cards are run together as one multi-line value.
</p>

<p>Any 64-bit integer column with a non-zero integer offset
(<code>TFORMn='K'</code>, <code>TSCALn=1</code>, <code>TZEROn&lt;&gt;0</code>)
is represented in the read table as Strings giving the decimal integer value,
since no numeric type in Java is capable of representing the whole range of
possible inputs. Such columns are most commonly seen representing
unsigned long values.
</p>
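
The reason for the String representation described above can be seen with a little arithmetic. The Java sketch below assumes the usual FITS convention for unsigned 64-bit columns, namely a stored signed `long` with `TZEROn` equal to 2^63. `OffsetLongSketch` and `toDecimalString` are hypothetical names, and this is not the code STIL itself uses.

```java
import java.math.BigInteger;

/** Hypothetical helper showing why offset 64-bit values need more than a Java long. */
class OffsetLongSketch {

    /** TZERO conventionally used for unsigned 64-bit columns: 2^63. */
    static final BigInteger UNSIGNED_TZERO = BigInteger.ONE.shiftLeft(63);

    /** Applies the offset in arbitrary precision and returns the decimal text. */
    static String toDecimalString(long stored, BigInteger tzero) {
        return BigInteger.valueOf(stored).add(tzero).toString();
    }

    public static void main(String[] args) {
        // The smallest and largest stored values map to 0 and 2^64 - 1.
        System.out.println(toDecimalString(Long.MIN_VALUE, UNSIGNED_TZERO));
        System.out.println(toDecimalString(Long.MAX_VALUE, UNSIGNED_TZERO));
    }
}
```

The largest result, 18446744073709551615, exceeds `Long.MAX_VALUE`, which is why a decimal string is a safe carrier where a primitive numeric type is not.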

<p>Where a multi-extension FITS file contains more than one table,
a single table may be specified using the position indicator,
which may take one of the following forms:
<ul>
<li>The numeric index of the HDU. The first extension
(first HDU after the primary HDU) is numbered 1.
Thus in a compressed FITS file named "<code>spec23.fits.gz</code>"
with one primary HDU and two BINTABLE extensions,
you would view the first one using the name "<code>spec23.fits.gz</code>"
or "<code>spec23.fits.gz#1</code>"
and the second one using the name "<code>spec23.fits.gz#2</code>".
The suffix "<code>#0</code>" is never used for a legal
FITS file, since the primary HDU cannot contain a table.
</li>
<li>The name of the extension.
This is the value of the <code>EXTNAME</code> header in the HDU,
or alternatively the value of <code>EXTNAME</code>
followed by "<code>-</code>" followed by the value of <code>EXTVER</code>.
This follows the recommendation in
the FITS standard that <code>EXTNAME</code> and <code>EXTVER</code>
headers can be used to identify an HDU.
So in a multi-extension FITS file "<code>cat.fits</code>"
where a table extension
has <code>EXTNAME='UV_DATA'</code> and <code>EXTVER=3</code>,
it could be referenced as
"<code>cat.fits#UV_DATA</code>" or "<code>cat.fits#UV_DATA-3</code>".
Matching of these names is case-insensitive.
</li>
</ul>
</p>
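
The two forms of position indicator described in FitsTableBuilder.xml can be sketched as a small matching routine. This Java example is an illustration under stated assumptions rather than STIL's implementation: `PositionSketch`, `positionOf` and `matches` are hypothetical names, the position is assumed to be the text after the last `#`, and a short all-digit position is assumed to be tried as an HDU index before being treated as a name.

```java
/** Hypothetical helper for matching a "#position" suffix against an HDU. */
class PositionSketch {

    /** Returns the text after the last '#', or null if the location has none. */
    static String positionOf(String location) {
        int i = location.lastIndexOf('#');
        return i < 0 ? null : location.substring(i + 1);
    }

    /**
     * True if the position identifies the HDU, either by numeric index
     * or by EXTNAME or EXTNAME-EXTVER compared case-insensitively.
     * extname and extver may be null when the header lacks them.
     */
    static boolean matches(String pos, int hduIndex, String extname, Integer extver) {
        // Up to nine digits always fits in an int, so parseInt cannot overflow.
        if (pos.matches("[0-9]{1,9}")) {
            return Integer.parseInt(pos) == hduIndex;
        }
        if (extname == null) {
            return false;
        }
        if (pos.equalsIgnoreCase(extname)) {
            return true;
        }
        return extver != null && pos.equalsIgnoreCase(extname + "-" + extver);
    }

    public static void main(String[] args) {
        String pos = positionOf("cat.fits#UV_DATA-3");
        System.out.println(pos + " matches: " + matches(pos, 2, "UV_DATA", 3));
    }
}
```

A location with no `#` suffix yields a null position here; per the text above, that case corresponds to the first table extension.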
