
YMX is a voice bank format for storing collections of YM2612 voices. It was invented by saxman for use in his Sonic QX program and has since been implemented in other music-related hacking programs, such as Sonic One Music Editor, xm3smps, and xm4smps. In 2017, saxman released YMX2SYX, a tool that converts YMX banks to SysEx banks supported by the Yamaha DX11, DX21, DX27, DX27S, DX100, and TX81Z synthesizers.

The name combines letters from "YM2612" and "DX", the latter referring to Yamaha's DX line of FM synthesizers from the 1980s.


The official specifications for the YMX format have been released by saxman.

YMX revision 1 format specification
Created by Damian Grove

The original YMX format is sufficient for most uses, but I wanted to update
the format to make it a little more versatile and add support for the GEMS FM
voice format so the YMX format could be extended to non-SMPS games. The new
group feature will allow you to categorize your voices, so if you have voices
from multiple levels or games, or if you want to group together all the types
of instruments (e.g. woodwind, brass, etc.), you can do that. You can also
have up to 65535 voices now, a vast improvement over the 128 limit from the
original specification.

All word and dword values described in this specification are little-endian.
Any implementation of this revision should also support the original format.
SMPS voices are in the Sonic 2 format and are 25 bytes in length. GEMS PSG
voices are not supported. Only the FM-based GEMS voices are supported, and
they are 39 bytes in length. All strings are in ASCII format. Voice and group
names should use 0x20 for empty characters. Strings do not use any special
terminating characters.
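
The conventions above can be illustrated with a minimal Python sketch. The helper names `pack_name` and `unpack_name` are hypothetical and not part of the specification; the sketch only assumes what the text states: little-endian integers, ASCII strings, and 0x20 padding with no terminator.

```python
import struct

def pack_name(name: str, width: int = 10) -> bytes:
    """Encode a name as a fixed-width ASCII field padded with 0x20."""
    raw = name.encode("ascii")
    if len(raw) > width:
        raise ValueError("name does not fit in the field")
    return raw.ljust(width, b" ")

def unpack_name(field: bytes) -> str:
    """Decode a fixed-width ASCII field and drop the 0x20 padding."""
    return field.decode("ascii").rstrip(" ")

def pack_word(value: int) -> bytes:
    """Encode a 16-bit word in little-endian byte order."""
    return struct.pack("<H", value)
```

Note that stripping trailing spaces on read also removes any spaces that were intentionally placed at the end of a name; a tool that needs to round-trip names exactly may prefer to keep the raw field.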

Below is the structure of a YMX revision 1 file:

    string[6]       "YM2612"
    byte            revision (0x01)
    word            number of voices (up to 65535 voices)
    byte            number of groups (up to 255 groups)
    dword           pointer to voice pointers
    dword           pointer to group pointers
    bits[8]         (76543210)
                    0 -- Voice format (0=SMPS[S2], 1=GEMS)
                    1-7 -- RESERVED
    byte            length of description
    string[0-255]   bank description

    dword           pointer to voice

    byte            group (0=none)
    byte[25|39]     voice data (25 bytes for SMPS, 39 bytes for GEMS)
    string[10]      voice name
    byte            length of description
    string[0-255]   voice description

    dword           pointer to group

    byte            length of group name
    string[0-255]   group name
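
As a rough illustration of the layout above, the following Python sketch reads a revision 1 bank. It rests on several assumptions the specification does not state outright: every pointer is treated as an absolute offset from the start of the file, each pointer table holds one dword per voice or group, and the voice data length is taken from bit 0 of the flags byte. The names `parse_rev1` and `read_prefixed_string` are hypothetical.

```python
import struct

# Fixed part of the header: magic, revision, voice count, group count,
# two table pointers, flag bits, description length (20 bytes in total).
HEADER = struct.Struct("<6sBHBIIBB")  # "<" = little-endian, no padding

def read_prefixed_string(data, offset):
    """Read a byte length followed by that many ASCII characters."""
    length = data[offset]
    return data[offset + 1:offset + 1 + length].decode("ascii")

def parse_rev1(data):
    (magic, revision, n_voices, n_groups,
     voice_table, group_table, flags, _desc_len) = HEADER.unpack_from(data, 0)
    if magic != b"YM2612" or revision != 1:
        raise ValueError("not a YMX revision 1 file")
    is_gems = bool(flags & 0x01)          # bit 0: 0=SMPS (S2), 1=GEMS
    voice_len = 39 if is_gems else 25
    # The description length byte is the last byte of the fixed header.
    description = read_prefixed_string(data, HEADER.size - 1)

    voices = []
    for i in range(n_voices):
        # Assumption: table entries are absolute file offsets.
        (pos,) = struct.unpack_from("<I", data, voice_table + 4 * i)
        name_start = pos + 1 + voice_len
        voices.append({
            "group": data[pos],           # 0 means no group
            "data": data[pos + 1:name_start],
            "name": data[name_start:name_start + 10].decode("ascii").rstrip(" "),
            "description": read_prefixed_string(data, name_start + 10),
        })

    groups = []
    for i in range(n_groups):
        (pos,) = struct.unpack_from("<I", data, group_table + 4 * i)
        groups.append(read_prefixed_string(data, pos))

    return {"format": "GEMS" if is_gems else "SMPS",
            "description": description, "voices": voices, "groups": groups}
```

Calling `parse_rev1(open(path, "rb").read())` would return a dictionary with the bank description, the voice list, and the group names. The sketch does no bounds checking, so a truncated or malformed file will raise an `IndexError` or `struct.error` rather than a descriptive message.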

Below is the original YMX format:

    string[6]       "YM2612"
    byte            revision (0x00)
    byte            number of voices, stored minus one (0=0xFF, 1=0x00, 2=0x01; up to 128=0x7F)

    byte[25]        voice data (SMPS Sonic 2 format)
    string[10]      voice name
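
The original format is simple enough to read with a short loop. The Python sketch below assumes that the voice records follow the 8-byte header directly and are stored back to back, which the layout above suggests but does not state. The function name `parse_original` is hypothetical.

```python
def parse_original(data):
    """Return a list of (name, voice_data) pairs from an original YMX bank."""
    if data[:6] != b"YM2612" or data[6] != 0x00:
        raise ValueError("not an original-format YMX file")
    # The count byte holds the voice count minus one, so 0xFF wraps to 0.
    count = (data[7] + 1) & 0xFF
    if count > 128:
        raise ValueError("voice count exceeds the 128-voice limit")

    voices = []
    offset = 8
    for _ in range(count):
        voice_data = data[offset:offset + 25]        # SMPS Sonic 2 voice
        name = data[offset + 25:offset + 35].decode("ascii").rstrip(" ")
        voices.append((name, voice_data))
        offset += 35                                 # 25 data + 10 name bytes
    return voices
```

A reader that supports both revisions, as the specification asks, could inspect the revision byte at offset 6 and dispatch to this function when it is 0x00.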