documentation for binary files


Compressed files and compiled code are considered binary files, or binaries, because they either need to be translated back into an uncompressed form before a person can read them, or are meant to be interpreted by something other than a person.

Sometimes a binary file begins with a table of contents of its various parts, e.g. OpenType fonts and some compressed file formats. The table references those parts by their byte offsets within the file. (Notably, some partly text-based formats have such a table too: PDF (Portable Document Format) keeps a table of byte offsets at the end of the file rather than at the beginning.)
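As a rough illustration of such a table of contents, the sketch below reads the table directory of an OpenType (sfnt) file: a 12-byte header followed by 16-byte records, each holding a tag, checksum, offset and length. The function names and the tiny demo blob are assumptions made for this example; the blob only has the shape of a font and is not a usable one.

```python
import struct

def read_table_directory(data: bytes) -> dict:
    """Return {tag: (offset, length)} from an sfnt/OpenType table directory."""
    # Header: sfntVersion (uint32), then numTables, searchRange,
    # entrySelector, rangeShift (uint16 each), all big-endian.
    _version, num_tables, _sr, _es, _rs = struct.unpack_from(">IHHHH", data, 0)
    tables = {}
    for i in range(num_tables):
        # Each record: 4-byte tag, then checksum, offset, length (uint32 each).
        tag, _checksum, offset, length = struct.unpack_from(">4sIII", data, 12 + 16 * i)
        tables[tag.decode("latin-1")] = (offset, length)
    return tables

def build_demo_blob() -> bytes:
    """Build a tiny sfnt-shaped blob with made-up tags, for demonstration only."""
    bodies = {b"aaaa": b"first part", b"bbbb": b"second"}
    cursor = 12 + 16 * len(bodies)        # first part starts after the directory
    records, payload = b"", b""
    for tag, body in bodies.items():
        records += struct.pack(">4sIII", tag, 0, cursor, len(body))
        payload += body
        cursor += len(body)
    header = struct.pack(">IHHHH", 0x00010000, len(bodies), 0, 0, 0)
    return header + records + payload

demo = build_demo_blob()
directory = read_table_directory(demo)
offset, length = directory["bbbb"]
print(directory)                          # {'aaaa': (44, 10), 'bbbb': (54, 6)}
print(demo[offset:offset + length])       # b'second'
```

The reader never scans the payload; it jumps straight to each part through the offsets in the directory.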

inserting documentation within a binary file

Documentation can be included within a binary file when the file has a table of contents with byte offsets. The documentation is inserted between the sections, and the byte offsets are adjusted to point past the insertions, so the interpreters of the binary never read them.

Whether or not those insertions of documentation are listed in the table of contents, editor programs unaware of such unorthodox insertions will likely ignore them too, and then leave them out when rewriting the file with new binary edits. In other words, the documentation will be lost unless the editor program for the binary is capable of handling the unorthodox insertions.
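The two paragraphs above can be sketched with a toy format. Everything here is an assumption made for illustration: the layout (a part count, then offset and length pairs, then the payload) and the names pack and unpack do not belong to any real format or tool.

```python
import struct
from typing import Optional

def pack(parts: list, notes: Optional[dict] = None) -> bytes:
    """Pack parts behind an offset table; notes[i] is placed just before part i."""
    notes = notes or {}
    cursor = 4 + 8 * len(parts)           # payload starts right after the table
    table, payload = b"", b""
    for i, part in enumerate(parts):
        note = notes.get(i, b"")
        payload += note                    # documentation bytes, not listed in the table
        cursor += len(note)                # shift the offset past the insertion
        table += struct.pack(">II", cursor, len(part))
        payload += part
        cursor += len(part)
    return struct.pack(">I", len(parts)) + table + payload

def unpack(blob: bytes) -> list:
    """Read parts strictly through the offset table, as an interpreter would."""
    (count,) = struct.unpack_from(">I", blob, 0)
    parts = []
    for i in range(count):
        offset, length = struct.unpack_from(">II", blob, 4 + 8 * i)
        parts.append(blob[offset:offset + length])
    return parts

parts = [b"\x01\x02\x03", b"\xff\xfe"]
plain = pack(parts)
documented = pack(parts, {1: b"NOTE: part 1 holds flags. "})

# Readers that follow the offsets see identical parts in both files.
same_parts = unpack(plain) == unpack(documented) == parts

# A naive editor reads the parts and writes them back, dropping the note.
resaved = pack(unpack(documented))
print(same_parts, b"NOTE" in documented, b"NOTE" in resaved)   # True True False
```

The last line shows both effects at once: the insertion is invisible to a reader that follows the offsets, and it disappears as soon as an unaware editor rewrites the file.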

external documentation for a binary file

A hyperlink can use any protocol. The HTML viewer merely needs to be able to handle the protocol specified, even if only by sending the text of the link to a program that can do something with it. For example, the text of a link using the "mailto:" protocol is sent to an email program.

A binary file can also be documented externally, by hyperlinks that reference the various parts within it, such as links using a protocol that references a file by byte offsets.

A protocol is simply a set of guidelines, and it can define anything desired. Therefore, a protocol for a specific file format can declare names for the parts of that binary format. Those named parts could be referenced reliably, even when the contents of those parts have changed and their new sizes have shifted their locations within the file (i.e. their offsets).
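A minimal sketch of resolving such a named reference follows. The "binpart:" scheme, the resolve function and the two directories are invented for this example; the directories stand in for what a format-aware resolver might read from two revisions of the same file.

```python
from urllib.parse import urlsplit

# Hypothetical directories for two revisions: {part name: (offset, length)}.
# In revision 2 a new part was added and the "name" part grew, so it moved.
REVISION_1 = {"head": (44, 54), "name": (98, 120)}
REVISION_2 = {"head": (44, 54), "docs": (98, 300), "name": (398, 150)}

def resolve(link: str, directory: dict) -> tuple:
    """Map a 'binpart:<file>#<part>' link to (file, start byte, end byte)."""
    url = urlsplit(link)
    if url.scheme != "binpart" or not url.fragment:
        raise ValueError(f"not a binpart link: {link!r}")
    offset, length = directory[url.fragment]
    return url.path, offset, offset + length

link = "binpart:demo.otf#name"
print(resolve(link, REVISION_1))   # ('demo.otf', 98, 218)
print(resolve(link, REVISION_2))   # ('demo.otf', 398, 548)
```

The same link text resolves correctly against both revisions, whereas a link that hard-coded bytes 98 to 218 would point into the wrong part after the second revision.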

An HTML viewer can replace an <img> tag with the image referenced in its src attribute, or replace an <iframe> with its referenced document. Similarly, a part referenced by a protocol for a binary format could be inserted when it is the source of an <iframe>. Perhaps it could be interpreted from its binary form too, much as a PDF can be interpreted for an <img> tag, maybe controlled by an option specified with the protocol.

It might be possible to have a generic protocol for all binary formats that use the same approach for their table of contents, though probably with less specific references.

