                        How to get information about file formats
                                   by Guenter Born

This text has been written in the hope that it may be helpful for those seeking
information about file formats. The text may be distributed freely under
the following terms:

 The text must be distributed unmodified and in its entirety.
 The text comes AS IS without any warranty.

--------------
Abstract:
This file gives a short overview of the file formats used on the PC and
shows ways to obtain more information about the internals of
popular file formats.
--------------
Today, each application program uses its own vendor-specific
format to store data. End users are confused by the number of extensions
used by the various applications.

Sometimes different applications use the same extension for different
formats. In this case it is hard to decide what a file contains. The
following list gives a short overview of formats used on different
platforms.

EPS   Encapsulated PostScript: this format is used to describe graphics
      and text. PostScript interpreters can display these files. Some
      programs store a reduced image in the TIF or WMF format in the EPS
      header. This allows programs like WinWord to display this
      image as a placeholder in documents. The information in an EPS
      file is stored in ASCII format.
DXF   Drawing Exchange Format: This file type is used by AutoCAD to export
      and import CAD-drawings. The format represents drawings in a
      vector format. Each command is stored as an ASCII string
      (a modified binary version is also available).
DHP   Dr. Halo Paint: This is a binary graphics format used by
      Dr. Halo to store raster graphic images.
CGM   Computer Graphics Metafile: This is a metafile format to store
      graphics. The format is standardized by the ISO and is supported
      by several programs. The format allows several representations
      (ASCII, binary) of graphic data.
IGES  Initial Graphics Exchange Specification: this is a vector graphics
      format used by CAD programs for data export and import.
IMG   This is a format used by the GEM GUI to store (monochrome) pixel
      graphics. The format is supported by Ventura Publisher and some
      conversion utilities (e.g. PaintShop Pro).
GEM   This is the graphic metafile format used by the GEM GUI.
HPGL  This is the Hewlett-Packard Graphics Language, used to send
      graphics to HP plotters. Many programs provide a printer output
      for HPGL files. Some programs can import drawings in this format.
TIFF  This is the Tag Image File Format, widely used on PC and Mac
      platforms to store graphic images (TIF extension on PC).
      The file format is defined by the TIFF standard (now
      Version 6.0), created by several companies (Microsoft,
      Hewlett-Packard, Aldus etc.).
      The TIFF standard defines many different methods to store a graphic
      in a file. Many programs support only parts of the TIFF standard.
PNTG  This is a format used on the Mac to store graphics (MacPaint).
RIFF  This is a format defined by Microsoft to store multimedia files
      (AVI, WAV etc.) on the Windows platform. Letraset uses the same
      extension for its Raster Image File Format, which stores a compressed
      TIFF variant on the Macintosh.
PCX   A simple format to store bitmap graphics. The PCX format became very
      popular because it is used by the ZSoft program Paintbrush (which is
      delivered with Windows 3.x).
WPG   This format was defined by WordPerfect (now Novell) to store graphics for
      WordPerfect products. This format contains metafile and bitmap information.
PIC   This extension is used by several programs to store different kinds of
      data. In Lotus 1-2-3 a PIC file holds information about a graphic.
      Dr. Halo uses this extension to store bitmap graphics. The formats are
      not compatible. Micrografx also uses the PIC extension to store graphics.
DRW   Format used by Micrografx to store graphics.
GRF   Format used by Micrografx to store graphics.
AI    Adobe Illustrator format to store graphics (this is a format similar to
      PostScript).
FLI   Animation file format defined by Autodesk for displaying animated images.
FLM   Extension of the FLI format.
MSP   This format was used by MS-Paint in Windows 2.0.
GIF   This format was defined by CompuServe to store graphic files.
BMP   This is a format defined in Windows and OS/2 to store bitmap graphics.
      Unfortunately, since version 2.0 OS/2 comes with a modified version which
      isn't compatible with the old Windows BMP format.
RLE   This is nothing other than a Windows BMP graphic format which stores the
      image data in compressed form.
WMF   The Windows Metafile Format (WMF) is used in MS-Windows to store
      images in a Metafile format. In Windows 95 a new Enhanced Metafile
      Format (EMF) is introduced.
IFF   This format is widely used on the Amiga platform to store graphics, text,
      music and so on. On the PC platform the IFF variant is used to store
      graphics (LBM extension). A modified variant, AIFF (Audio IFF), is used
      on the Macintosh and on the PC to store sound data.
CUT   This format is used by Dr. Halo to store color map files.
PCT   Graphic format used on the Mac (Mac PICT).
TGA   The TARGA graphic file format was defined by TrueVision to store image
      data (Bitmaps in different ways: color map, RGB).
JFIF  Exchange format for JPEG images

MID   MIDI association file format to store sound for MIDI devices
WAV   This format was defined by Microsoft to store and play sounds under
      Windows.
AVI   This format is used under Windows to play AVI videos.
QTM   Apple's QuickTime format was developed to play videos on the Mac and PC.
DVI   Intel's Digital Video format.
MOD   Format to store and play sound

BIFF  This is the format used in Excel to store spreadsheets
WKx   These formats are used in Lotus 1-2-3 and Symphony to store spreadsheets
DOC   Format used in WinWord to store text documents
RTF   Rich Text Format, defined by Microsoft as the interchange format for
      Windows
SYLK  Symbolic Link format, defined by Microsoft as an interchange format for
      Excel
DIF   Data Interchange Format, an exchange format for spreadsheet data.

The list above covers only a small part of the formats used on the different
computer platforms. For graphic files, some conversion programs are available to
convert the data into other formats. Shareware programs like PaintShop,
Graphic Workshop and Alchemy offer the ability to read and write different bitmap
graphic formats. The program HIJACK can convert over 70 graphic formats (some of
these are vector formats).
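
Because an extension alone does not reliably identify a format, a program
can look at the first bytes of a file instead. The following Python sketch
is only an illustration under stated assumptions: the names SIGNATURES,
CONTAINERS, guess_format and guess_file_format are made up for this example,
the table covers just a handful of the formats listed above, and a match
is a hint rather than a proof (the two-byte BMP signature in particular is
weak).

```python
# Leading byte signatures for a few of the formats listed above.
SIGNATURES = [
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"II*\x00", "TIFF"),          # little-endian ("Intel") byte order
    (b"MM\x00*", "TIFF"),          # big-endian ("Motorola") byte order
    (b"%!PS-Adobe", "PostScript/EPS"),
    (b"\xc5\xd0\xd3\xc6", "EPS with binary preview header"),
    (b"{\\rtf", "RTF"),
    (b"MThd", "MID"),
    (b"BM", "BMP"),                # only two bytes, so a weak hint
]

# Container formats carry a four-byte form type at offset 8.
CONTAINERS = {
    b"RIFF": {b"WAVE": "WAV", b"AVI ": "AVI"},
    b"FORM": {b"ILBM": "IFF/ILBM", b"AIFF": "AIFF"},
}


def guess_format(data: bytes) -> str:
    """Guess a format name from the first bytes of a file."""
    forms = CONTAINERS.get(data[:4])
    if forms is not None and len(data) >= 12:
        form_type = data[8:12]
        fallback = data[:4].decode("ascii") + " (other)"
        return forms.get(form_type, fallback)
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def guess_file_format(path: str) -> str:
    """Read the first 16 bytes of a file and guess its format."""
    with open(path, "rb") as handle:
        return guess_format(handle.read(16))
```

Calling guess_file_format on a file whose extension is ambiguous (PIC, for
example) will often return "unknown" with this small table; a real tool
would need many more entries and additional plausibility checks.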

Information for developers
-------------------------------
Yet end-users demand that programs accept data files that may have been created
by totally different and sometimes competing programs. Software developers in
particular depend on the file format information of different application
programs to implement advanced import and export functions. Unfortunately, most
of the information about file formats is confidential, not well-documented, or
not available for public use.
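
To give an impression of what an import function has to cope with, the
following Python sketch tells apart the BMP variants mentioned above by
the size field of the header that follows the 14-byte file header. It is
an illustration only: the function name describe_bmp_header is invented
for this example, the header sizes (12 for OS/2 1.x, 40 for Windows 3.x,
64 for a full-length OS/2 2.x header) are assumptions that should be checked
against the vendor documentation, and OS/2 2.x files with shortened headers
are not handled.

```python
import struct


def describe_bmp_header(data: bytes) -> str:
    """Report the BMP variant and image size from the first 26 bytes."""
    if len(data) < 26 or data[:2] != b"BM":
        raise ValueError("not a BMP file or header too short")
    # The size of the bitmap header is stored right after the file header.
    (header_size,) = struct.unpack_from("<I", data, 14)
    if header_size == 12:
        # Assumed OS/2 1.x layout: 16-bit width and height.
        width, height = struct.unpack_from("<HH", data, 18)
        variant = "OS/2 1.x"
    elif header_size == 40:
        # Assumed Windows 3.x layout: 32-bit signed width and height.
        width, height = struct.unpack_from("<ii", data, 18)
        variant = "Windows 3.x"
    elif header_size == 64:
        # Assumed full-length OS/2 2.x layout: 32-bit width and height.
        width, height = struct.unpack_from("<ii", data, 18)
        variant = "OS/2 2.x"
    else:
        return f"unrecognized header size {header_size}"
    return f"{variant} bitmap, {width} x {height}"
```

A caller would pass the first bytes of a file, for example
describe_bmp_header(open(name, "rb").read(26)), and choose the matching
reader from the result.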

The biggest problem is how to obtain information about the internals of a file
format. One source could be the CompuServe library areas. Searching with the IBM
File Finder results in several entries. The formats of some graphic files may be
found in the GRAPHICS forum. Other sources are the forums of Microsoft,
Autodesk, Lotus etc. An alternative is the Internet. Unfortunately, there
is no structure for getting all the needed information quickly and easily. A
single source for different file formats is badly needed. Since 1995 some books
which deal with this subject have been available in English.

a) The first book was published by O'Reilly and discusses several
   graphic file formats:

    J.D. Murray, William VanRyper: Graphics File Formats, 890 pages, 1994,
    ISBN 1-56592-058-9

This book is divided into two parts. Part 1 discusses only principles: it
introduces computer graphics, describes why bitmap, metafile and vector
formats are needed, and gives some impressions of different
graphic devices. Part 2 contains a (sometimes short) description of
different graphic file formats (from PCX, TIFF and TARGA to other rarely used
formats). A CD-ROM contains additional material from different Internet
sources.

b) The second source was first published in 1989/90 in German
(Referenzhandbuch Dateiformate, 800 pages, Addison Wesley, Germany,
ISBN 3-89319-815-6). (A Russian version with 700 pages is available from
BHV publishing, Kiev, ISBN 5-87685-023-3.)

The English version has been available since 1995 from International Thomson
Publishing and Van Nostrand Reinhold:

    Guenter Born: The File Formats Handbook, 1274 pages, 1995, International
    Thomson Publishing, London, ISBN 1-85032-117-5

This book contains descriptions of file formats from the areas of databases,
spreadsheets, word processing, graphics, sound and multimedia. The book
is written for developers, consultants, researchers and students. Below
is a short list of the format names discussed in this book:
-------
The File Formats Handbook

PART 1 Database file formats
dBASE II (DBF files, Index file structure, MEM)
dBASE III (DBF, Clipper NTX, NDX, MEM, DBT, LBL)
dBASE IV (DBF, DBT)
FoxPro (DBF, DBT, FPT, IDX, CDX, LBX)
SDF

PART 2 Spreadsheet formats
LOTUS 1-2-3 WKS/WK1 file format
LOTUS 1-2-3 WK3 file format
LOTUS 1-2-3 FRM file format
LOTUS 1-2-3 PIC format
LOTUS Symphony format
Data Interchange Format (DIF)
Super Data Interchange format (SDI)
Standard Interface format (SIF)
Symbolic Link Format (SYLK)
Excel binary interchange format (BIFF)

PART 3 Word processing formats
MS-Word format
WordStar format
WordPerfect format
Rich Text format (RTF version 1.2)
Standard Generalized Markup Language (SGML)
AMI Pro version 3.0/4.0 file format

PART 4 Graphic formats
Paintbrush format (PCX)
CAPTURE File Format (SCR)
GEM Image format (IMG)
GEM Metafile format (GEM)
Interchange File Format (IFF)
Graphics Interchange format (GIF)
Tag Image File Format (TIFF)
Computer Graphic Metafile format (CGM)
WordPerfect Graphic format (WPG)
AutoCAD Drawing Exchange format (DXF)
Micrografx formats (PIC, DRW, GRF)
TARGA format (TGA)
Dr. Halo format (PIC, CUT, PAL)
SUN Raster format (RAS)
Adobe Photoshop format (PSD)
PCPAINT/Pictor format (PIC)
JPEG/JFIF format (JPG)
MAC-Paint format (MAC)
MAC-Picture format (PICT)
Atari NEOchrome format (NEO)
NEOchrome Animation format (ANI)
Animatic Film format (FLM)
ComputerEyes Raw Data format (CE1,CE2)
Cyber Paint Sequence format (SEQ)
Atari DEGAS format (PI*,PC*)
Atari Tiny format (TNY, TN*)
Atari Imagic Film/Picture format (IC*)
Atari STAD format (PAC)
Autodesk Animator format (FLI)
Autodesk 3D Studio format (FLC)
Amiga Animation format (ANI)
Audio/Video Interleaved format (AVI)
Intel Digital Video format (DVI)
Apple QuickTime format (QTM)
CAS Fax format (DCX)
Adobe Illustrator format (AI)
Initial Graphics Exchange Language (IGES)

PART 5 Windows and OS/2 file formats
Windows 2.0 Paint format (MSP) 
Windows 3.x BMP and RLE format
OS/2 Bitmap format (BMP, version 1.2)
OS/2 Bitmap format (BMP, version 2.x)
Windows Icon format (ICO)
Windows Metafile format (WMF)
Write binary format (WRI)
Windows 3.x Calendar format (CAL)
Windows Cardfile format (CRD)
Clipboard format (CLP)
Windows 3.x group files (GRP)

PART 6 Sound formats
Creative Music Format (CMF)
Soundblaster Instrument format (SBI)
Soundblaster Instrument Bank format (IBK)
Creative Voice format (VOC)
Adlib Music format (ROL)
Adlib Instrument Bank format (BNK)
AMIGA MOD format
Audio IFF format (AIFF) 
Windows WAV format
Standard MIDI format (SMF)
NeXT/Sun Audio format

PART 7 Page description languages
Hewlett Packard Graphic Language (HP-GL/2)
Hewlett Packard Printer Communication Language (PCL)
Encapsulated PostScript format (EPS) version 3.0

