BDOSE v1.1 file format
Header
SNP identifiers from Z file (sequence of NSNPs blocks)
Sample IDs (sequence of NSamples blocks)
Dosage data offsets
Dosage data (sequence of NSNPs blocks)
Bytes | Description |
8 | Magic number (bdose1.1) |
4 | Unsigned integer indicating the length LBGEN_filename in bytes of the BGEN filename used to generate the BDOSE file |
LBGEN_filename | Name of the BGEN file |
8 | Unsigned integer indicating the size SBGEN_file of the BGEN file in bytes |
Min(1000, SBGEN_file) | First bytes of the BGEN file |
8 | Unsigned integer indicating the size SBDOSE_file of the BDOSE file in bytes |
4 | Unsigned integer indicating the number of samples NSamples in the BDOSE file |
4 | Unsigned integer indicating the number of SNPs NSNPs in the BDOSE file |
1 | Unsigned integer representing the compression level Clevel, which sets the number of bytes Nbytes used to store each dosage value in the BDOSE file |
Clevel = 0 indicates Nbytes = 2 | |
Clevel = 1 indicates Nbytes = 4 | |
Clevel = 2 indicates Nbytes = 8 | |
Clevel = 3 indicates Nbytes = 1 | |
8 | Unsigned integer indicating the start position of the sample IDs in the BDOSE file |
8 | Unsigned integer indicating the start position of the dosage data offsets in the BDOSE file |
8 | Unsigned integer indicating the start position of the dosage data in the BDOSE file |
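The header fields can be read sequentially in the order of the table above. The following is a minimal sketch rather than a reference implementation: the function name `read_bdose_header` and the dictionary keys are hypothetical, and the byte order (little-endian) and the text encoding of the filename (UTF-8) are assumptions, since the table specifies neither.

```python
import struct


def read_bdose_header(f, byteorder="<"):
    """Read the BDOSE v1.1 header from a binary file object.

    ASSUMPTION: integers are little-endian ("<") and the BGEN filename
    is UTF-8 text; the format description states neither.
    """
    def unpack(code):
        fmt = byteorder + code
        return struct.unpack(fmt, f.read(struct.calcsize(fmt)))[0]

    magic = f.read(8)
    if magic != b"bdose1.1":
        raise ValueError(f"unexpected magic number: {magic!r}")

    header = {}
    name_len = unpack("I")                          # LBGEN_filename
    header["bgen_filename"] = f.read(name_len).decode("utf-8")
    header["bgen_size"] = unpack("Q")               # SBGEN_file
    header["bgen_first_bytes"] = f.read(min(1000, header["bgen_size"]))
    header["bdose_size"] = unpack("Q")              # SBDOSE_file
    header["n_samples"] = unpack("I")               # NSamples
    header["n_snps"] = unpack("I")                  # NSNPs
    header["clevel"] = unpack("B")                  # Clevel
    header["sample_ids_start"] = unpack("Q")
    header["offsets_start"] = unpack("Q")
    header["dosage_start"] = unpack("Q")
    return header
```

The stored BGEN size and first bytes could be compared against the BGEN file on disk to check that the BDOSE file was generated from it.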
Bytes | Description |
4 | Unsigned integer indicating the length LBlock of the SNP identifier block in bytes |
4 | Unsigned integer indicating the line in which the SNP appears in the Z file |
2 | Unsigned integer indicating the length Lrsid of the entry in column rsid of the Z file in bytes |
Lrsid | Entry in column rsid of the Z file |
4 | Unsigned integer indicating the entry in column position of the Z file |
2 | Unsigned integer indicating the length Lchromosome of the entry in column chromosome of the Z file in bytes |
Lchromosome | Entry in column chromosome of the Z file |
4 | Unsigned integer indicating the length Lallele1 of the entry in column allele1 of the Z file in bytes |
Lallele1 | Entry in column allele1 of the Z file |
4 | Unsigned integer indicating the length Lallele2 of the entry in column allele2 of the Z file in bytes |
Lallele2 | Entry in column allele2 of the Z file |
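LBlock = 20 + Lrsid + Lchromosome + Lallele1 + Lallele2 is the number of bytes in the SNP identifier block following the 4-byte LBlock field (the fixed-size fields after it contribute 4 + 2 + 4 + 2 + 4 + 4 = 20 bytes) |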
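A single SNP identifier block could be parsed from an in-memory buffer as sketched below. The function name `parse_snp_block` and the record keys are hypothetical. Little-endian byte order and UTF-8 text are assumptions. The length check relies on the reading that LBlock counts only the bytes after the LBlock field itself, which is what the formula with the constant 20 implies.

```python
import struct


def parse_snp_block(buf, offset=0, byteorder="<"):
    """Parse one SNP identifier block; return (record, next_offset).

    ASSUMPTION: little-endian integers and UTF-8 strings.
    """
    def take(code):
        nonlocal offset
        fmt = byteorder + code
        (value,) = struct.unpack_from(fmt, buf, offset)
        offset += struct.calcsize(fmt)
        return value

    def take_text(length):
        nonlocal offset
        text = bytes(buf[offset:offset + length]).decode("utf-8")
        offset += length
        return text

    block_len = take("I")                 # LBlock
    start = offset                        # LBlock counts bytes from here on
    record = {"z_line": take("I")}        # line of the SNP in the Z file
    record["rsid"] = take_text(take("H"))
    record["position"] = take("I")
    record["chromosome"] = take_text(take("H"))
    record["allele1"] = take_text(take("I"))
    record["allele2"] = take_text(take("I"))
    if offset - start != block_len:
        raise ValueError("SNP identifier block length mismatch")
    return record, offset
```

Calling the function NSNPs times, each time passing the returned offset, walks the whole SNP identifier section.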
Bytes | Description |
4 | Unsigned integer indicating the length Lsample_ID of the sample ID in bytes |
Lsample_ID | Sample ID |
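The sample ID section is a run of length-prefixed strings, so it can be read with a simple loop. This is a sketch under the same assumptions as elsewhere (little-endian integers, UTF-8 text). The name `read_sample_ids` is hypothetical, and `start` is meant to be the sample ID start position stored in the header.

```python
import struct


def read_sample_ids(f, start, n_samples, byteorder="<"):
    """Read n_samples length-prefixed sample IDs beginning at byte `start`.

    ASSUMPTION: little-endian lengths and UTF-8 encoded IDs.
    """
    f.seek(start)
    sample_ids = []
    for _ in range(n_samples):
        (length,) = struct.unpack(byteorder + "I", f.read(4))  # Lsample_ID
        sample_ids.append(f.read(length).decode("utf-8"))
    return sample_ids
```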
Bytes | Description |
8 × NSNPs | Unsigned integers indicating the start position of the compressed dosage data for each SNP |
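Because the offset table is a fixed-width array of NSNPs 8-byte integers, it can be read in one call. The sketch below assumes little-endian byte order. The name `read_dosage_offsets` is hypothetical, and `start` is meant to be the offset-table start position stored in the header.

```python
import struct


def read_dosage_offsets(f, start, n_snps, byteorder="<"):
    """Read the NSNPs 8-byte start positions of the per-SNP dosage blocks.

    ASSUMPTION: little-endian unsigned 64-bit integers.
    """
    f.seek(start)
    raw = f.read(8 * n_snps)
    if len(raw) != 8 * n_snps:
        raise ValueError("truncated dosage offset table")
    return list(struct.unpack(f"{byteorder}{n_snps}Q", raw))
```

With the table in memory, the dosage block of any SNP can be located with a single seek instead of scanning the preceding blocks.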
Bytes | Description |
4 | Unsigned integer indicating the size Scompressed of the compressed dosage data in bytes |
4 | Unsigned integer indicating the size Suncompressed of the uncompressed dosage data in bytes |
Scompressed - 4 | Zstandard compressed dosage data in integer format |
Missing values are coded as follows:
y = 53248 | if Clevel = 0 |
y = 3489660928 | if Clevel = 1 |
y = 14987979559889010688 | if Clevel = 2 |
y = 208 | if Clevel = 3 |
Convert a dosage value from integer format y to floating-point format x with the following transformation:
x = 2^(2 - 8 × Nbytes) × y
See Clevel in the header for the number of bytes Nbytes.
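The conversion from integer to floating-point dosages can be sketched as follows. The function operates on the already decompressed bytes of one SNP. Zstandard decompression is left out because it needs a third-party binding. The names `decode_dosages`, `NBYTES`, `MISSING` and `FMT` are hypothetical, little-endian byte order is an assumption, and returning `None` for missing values is a choice made for this example only.

```python
import struct

# Bytes per dosage value for each compression level (from the header table).
NBYTES = {0: 2, 1: 4, 2: 8, 3: 1}
# Integer codes that mark a missing dosage, per compression level.
MISSING = {0: 53248, 1: 3489660928, 2: 14987979559889010688, 3: 208}
# struct codes for unsigned integers of 1, 2, 4 and 8 bytes.
FMT = {1: "B", 2: "H", 4: "I", 8: "Q"}


def decode_dosages(raw, clevel, byteorder="<"):
    """Convert decompressed integer dosages to floats (None if missing).

    ASSUMPTION: values are stored little-endian.
    """
    nbytes = NBYTES[clevel]
    if len(raw) % nbytes:
        raise ValueError("data length is not a multiple of Nbytes")
    count = len(raw) // nbytes
    values = struct.unpack(f"{byteorder}{count}{FMT[nbytes]}", raw)
    scale = 2.0 ** (2 - 8 * nbytes)       # x = 2^(2 - 8 * Nbytes) * y
    missing = MISSING[clevel]
    return [None if y == missing else y * scale for y in values]
```

For example, with Clevel = 0 the scale factor is 2^(-14), so y = 16384 decodes to x = 1 and y = 32768 to x = 2. Each of the four missing codes would map to 3.25 under the transformation, which is outside the dosage range of 0 to 2.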