hl7parse
find unicode bom
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

Classes
| struct | bom_t |
| | Byte Order Mark (BOM) information of a file. This struct is created by detect_bom() |
Typedefs
| typedef struct bom_t | bom_t |
| | Byte Order Mark (BOM) information of a file. This struct is created by detect_bom() |
Enumerations
| enum | bom_endianness_t { UNKNOWN, LITTLE, BIG, SIGNATURE } |
| | endianness detected in bom |
Functions
| char * | bom_to_string (int length, unsigned char *bom, bom_endianness_t endianness) |
| | hex representation of the bom |
| void | print_bom (bom_t *bom) |
| | debug function to print bom |
| bom_t * | detect_bom (FILE *fd) |
| | check if the file has a bom |
When parsing an HL7 file, the opened file pointer should be at the beginning of data (typically just at the beginning of MSH).
If the file contains a Unicode BOM and the file pointer points at the beginning of the file, the parser will fail. Therefore we must first skip the BOM bytes.
This is a crude method of detecting whether the file has a BOM. Alternatively, you may deploy your own method and just skip ahead until you know the file pointer is at the first character of data (at the beginning of MSH) before parsing the file.
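As an illustration of the "deploy your own method" alternative, the following is a minimal sketch that scans forward until it finds the bytes `MSH` and leaves the file pointer on the `M`. The helper name `seek_to_msh` is hypothetical and not part of hl7parse; it also assumes that the first occurrence of `MSH` in the stream really is the segment header.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of hl7parse: scan forward until the
 * three bytes "MSH" are found and leave the file pointer on the 'M'.
 * Returns the offset of "MSH", or -1 if it was not found. */
long seek_to_msh(FILE *fp)
{
    assert(fp != NULL);

    const char *needle = "MSH";
    int matched = 0;    /* how many bytes of "MSH" have matched so far */
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (c == needle[matched]) {
            matched++;
            if (matched == 3) {
                /* Step back over the three matched bytes. */
                long pos = ftell(fp) - 3;
                if (pos < 0 || fseek(fp, pos, SEEK_SET) != 0)
                    return -1;
                return pos;
            }
        } else {
            /* A mismatching 'M' may itself start a new match. */
            matched = (c == 'M') ? 1 : 0;
        }
    }
    return -1;
}
```

After a successful call, the next read from the file returns the `M` of `MSH`, so the file pointer is where the parser expects it regardless of which BOM (if any) preceded the data.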
We try to detect known BOM patterns and then place the file pointer just after the BOM. Known patterns:
2 Bytes
0xFF 0xFE
0xFE 0xFF
3 Bytes
0xEF 0xBB 0xBF
0xF7 0x64 0x4C
0x0E 0xFE 0xFF
0xFB 0xEE 0xFF
4 Bytes
0x2B 0x2F 0x76 // Followed by 38, 39, 2B, or 2F (ASCII 8, 9, + or /), depending on what the next character is.
0x00 0x00 0xFF 0xFF
0xFF 0xFE 0x00 0x00
0xDD 0x73 0x66 0x73
0x84 0x31 0x95 0x33
| enum bom_endianness_t |
| char* bom_to_string | ( | int | length, | unsigned char * | bom, | bom_endianness_t | endianness | ) |
hex representation of the bom
| length | length of input buffer |
| bom | byte array with the bom |
| endianness | endianness to display |
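To make "hex representation" concrete, here is a minimal sketch of formatting a byte array as space-separated hex values. The function `bytes_to_hex` is a hypothetical stand-in, not the library's bom_to_string(): it ignores the endianness argument, and the exact output format of the real function is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for illustration only; the output format of the
 * real bom_to_string() is not documented here and may differ.
 * Returns a newly allocated string such as "0xEF 0xBB 0xBF", or NULL
 * on allocation failure. The caller must free() the result. */
char *bytes_to_hex(int length, const unsigned char *bytes)
{
    assert(length >= 0 && (length == 0 || bytes != NULL));

    /* Each byte needs at most 5 characters ("0xNN" plus a separator),
     * plus one byte for the terminating NUL. */
    size_t size = (size_t)length * 5 + 1;
    char *out = malloc(size);
    if (out == NULL)
        return NULL;

    out[0] = '\0';
    size_t used = 0;
    for (int i = 0; i < length; i++) {
        int n = snprintf(out + used, size - used, "%s0x%02X",
                         i ? " " : "", bytes[i]);
        if (n < 0) {
            free(out);
            return NULL;
        }
        used += (size_t)n;
    }
    return out;
}
```

Returning a heap-allocated string mirrors the `char *` return type in the signature above; whether the real function expects the caller to free the result is not stated in this page.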
| bom_t* detect_bom | ( | FILE * | fd | ) |
check if the file has a bom
If there is a BOM, it will be copied to bom->bom, and the file pointer will be set to the first character after the BOM.
To check whether a BOM has been detected, test whether bom->length is greater than 0. The length is the number of bytes that bom->bom contains.
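The behaviour described above (copy the BOM, report its length, leave the file pointer just past it) can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: `sketch_bom` and `sketch_detect_bom` are hypothetical names, the struct layout is invented for the example, and only three of the patterns listed earlier are checked.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for bom_t; the real struct layout
 * is not shown on this page. */
typedef struct {
    unsigned char bom[4];
    int length;             /* 0 means no BOM was detected */
} sketch_bom;

/* Checks only 0xEF 0xBB 0xBF, 0xFF 0xFE and 0xFE 0xFF. The 4-byte
 * patterns are deliberately ignored, so a file starting with
 * 0xFF 0xFE 0x00 0x00 is reported here as a 2-byte BOM. */
sketch_bom sketch_detect_bom(FILE *fp)
{
    assert(fp != NULL);

    sketch_bom result = { {0}, 0 };
    unsigned char buf[3];
    long start = ftell(fp);
    size_t n = fread(buf, 1, sizeof buf, fp);

    if (n >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        result.length = 3;
    else if (n >= 2 && ((buf[0] == 0xFF && buf[1] == 0xFE) ||
                        (buf[0] == 0xFE && buf[1] == 0xFF)))
        result.length = 2;

    memcpy(result.bom, buf, (size_t)result.length);

    /* Reposition just past the BOM, or back to the start if none. */
    if (start >= 0)
        fseek(fp, start + result.length, SEEK_SET);
    return result;
}
```

A caller would test `result.length > 0` to learn whether a BOM was found, which matches the check described for bom->length. In either case the next read returns the first byte of data.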
| fd | file handle to read data from |
Returns
the detected BOM in bom->bom; its length is indicated by bom->length
| void print_bom | ( | bom_t * | bom | ) |
debug function to print bom
| bom |