hl7parse
decode.h File Reference

primary hl7 parse methods

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include "logging.h"
#include "bom.h"
#include "meta.h"
#include "node.h"
#include "util.h"

Functions

const char * version_parser ()
 parser version information

void print_error (int e, char *additional)
 pretty print system errors

unsigned char * extract_substr (int start, int length, unsigned char *buffer)
 extract a sub string from a string

int parse_segment (FILE *fd, hl7_meta_t *meta, node_t **fieldlist_p, unsigned char **segment_name)
 parse one HL7 line

int hl7_decode (FILE *fd, message_t **message_p)
 Entry point for the parser.

message_t * decode (FILE *fd, hl7_meta_t *meta)
 parse hl7 file

FILE * hl7_open (char *filename)
 open file

int hl7_close (FILE *fd)
 close file

Detailed Description

primary hl7 parse methods

high level interface to be exposed to bindings

Function Documentation

◆ decode()

message_t* decode ( FILE *  fd,
hl7_meta_t *  meta 
)

parse hl7 file

This method gives more control of the metadata than hl7_decode(). You may do your own delimiter detection and store the information in a meta_t struct.

Make sure to set meta_t.crlf to something other than -1 (the default); otherwise the parser will try to detect the delimiters again from the first line it is fed.

Parameters
fd      file pointer
meta    prepared metadata with meta_t.crlf != -1 to prevent automagic meta detection by hl7_decode()

◆ extract_substr()

unsigned char* extract_substr ( int  start,
int  length,
unsigned char *  buffer 
)

extract a sub string from a string

This method expects the string's length without the \0 character. It will allocate a new buffer (including room for the delimiting \0) and copy all data into the new buffer. The buffer will then be delimited with \0.

The user must take care to properly free() the newly allocated string.

Note
we do not check if you try to read past the end of buffer, you must take care of that!
Parameters
start     position of the substring's first character
length    string length without terminating \0
buffer    the source string
Returns
new char buffer
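
The behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation; substr_sketch() is a hypothetical stand-in that follows the documented contract (allocate length + 1 bytes, copy, terminate with \0), and like the original it performs no bounds check on buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for extract_substr(); not the library's code.
 * Copies `length` bytes starting at `start` into a fresh buffer and
 * appends the terminating \0. Returns NULL if the allocation fails. */
unsigned char *substr_sketch(int start, int length, const unsigned char *buffer)
{
    /* The caller must guarantee that start + length stays inside buffer;
     * only obviously invalid arguments are caught here. */
    assert(buffer != NULL && start >= 0 && length >= 0);

    unsigned char *out = malloc((size_t)length + 1); /* +1 for the \0 */
    if (out == NULL)
        return NULL;

    memcpy(out, buffer + start, (size_t)length);
    out[length] = '\0';
    return out;
}
```

A caller would typically check the result for NULL, use the string, and then release it with free().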

◆ hl7_close()

int hl7_close ( FILE *  fd)

close file

Convenience function, mainly used for language bindings

Parameters
fd    file pointer
Returns
return code of fclose()

◆ hl7_decode()

int hl7_decode ( FILE *  fd,
message_t **  message_p 
)

Entry point for the parser.

The parser has 2 modes:

  1. parse first line up until \r or \n is found. Determine delimiters, update meta
    See also
    hl7_delimiters()
  2. parse the rest of the file after the first line break, splitting fields into internal data structures

If you want to analyze parts of the document while the parser is still running, or you are using it in a multi-threaded environment, you can register callback functions. The following callback hooks are available:

  • void (*cb_progress)(message_t *message, size_t total, size_t current);
  • void (*cb_start)(message_t *message)
  • void (*cb_end)(message_t *message, size_t max, size_t current, int exit_code)
  • void (*cb_segment)(message_t *message, size_t num, char name[3])

Also, you can control how often the cb_progress() callback is fired; the default is every 1% of progress (it does not fire on files smaller than 100 bytes).

See also
message_state.h for more information on callback functions.
Parameters
fd           File pointer, must already be advanced to the last byte of the BOM (if any)
message_p    this must be an initialized message_t object.
Returns
0 on success, 1 if message_p is NULL, 2 if create_node fails on first segment (meta detection), 3 if first segment is not "MSH", > 10 for parse_segment() errors

◆ hl7_open()

FILE* hl7_open ( char *  filename)

open file

Convenience function, mainly used for language bindings

Parameters
filename    path to file
Returns
FILE* or NULL on error

◆ parse_segment()

int parse_segment ( FILE *  fd,
hl7_meta_t *  meta,
node_t **  fieldlist_p,
unsigned char **  segment_name 
)

parse one HL7 line

Please use read_meta() before using this method on the first line of the HL7 file. It might work without (if the HL7 file uses default delimiters) but is not safe.

This method reads one line of the HL7 file and parses it into a segment structure for node_t's. Meta will be updated once we reach end of line. meta->crlf and meta->sep_message will be updated when the first \n or \r character is found if meta->crlf is -1 (default value, no EOL detection happened yet).

meta->crlf and meta->sep_message will only be updated on the first pass, then cached data will be used to speed up the parsing process.

Note
do not use meta on different files, always use a new meta data structure on a new file

Error codes:

  • 1: failed to allocate memory for line buffer
  • 2: failed to allocate more memory for line buffer
  • 3: EOF was reached unexpectedly
  • 4: maximum delimiters per segment is reached, raise MAX_FIELDS
  • 5: failed to allocate raw_field
    See also
    create_raw_field_t()
  • 6: failed to allocate node_t
    See also
    create_node_t()
  • 7: failed to process sub field elements
    See also
    process_node()
  • 8: failed to append child
    See also
    node_append()
  • 9: failed to allocate memory for segment name
  • 10: File did not start with 'MSH'
  • 11: MSH-2 was not delimited by MSH-1, FIXME: these error codes may overlap where we use ret+10 now!
  • 12: Segment name is longer than 5 bytes
Parameters
fd              File pointer, must already be advanced to the last byte of the BOM (if any)
meta            must be initialized with at least the correct delimiters; this is typically done by parse_msh()
fieldlist_p     this is the node tree of the segment
segment_name    will hold the segment name, e.g. "PID", "OBX", ...
Returns
0 on success, else error code
Todo:
we fail on empty lines, deal with them
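
For logging, the error codes listed above can be turned into readable messages with a small lookup table. The following is only a sketch: parse_segment_strerror_sketch() and its table are hypothetical and not part of decode.h, and the strings simply restate the list above.

```c
#include <stddef.h>

/* Messages indexed by the parse_segment() error codes listed above
 * (index 0 = success). Hypothetical table, not part of the library. */
static const char *const parse_segment_errors[] = {
    "success",
    "failed to allocate memory for line buffer",
    "failed to allocate more memory for line buffer",
    "EOF was reached unexpectedly",
    "maximum delimiters per segment is reached, raise MAX_FIELDS",
    "failed to allocate raw_field",
    "failed to allocate node_t",
    "failed to process sub field elements",
    "failed to append child",
    "failed to allocate memory for segment name",
    "file did not start with 'MSH'",
    "MSH-2 was not delimited by MSH-1",
    "segment name is longer than 5 bytes",
};

/* Returns a static description for a parse_segment() return code,
 * or a generic message for codes outside the documented range. */
const char *parse_segment_strerror_sketch(int code)
{
    size_t n = sizeof parse_segment_errors / sizeof parse_segment_errors[0];

    if (code < 0 || (size_t)code >= n)
        return "unknown parse_segment() error";
    return parse_segment_errors[code];
}
```

The returned strings are static, so the caller must not free() them.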

automatic metadata detection

It is safest to first run read_meta() before calling parse_segment(). If you know that this is an MSH segment (and only then) it is safe to skip read_meta() prior to running parse_segment().

If read_meta() is not run, then we go into meta detection mode and expect the field separator at position 3 of this line.

Not sure if this is really a good idea? We assume that the segment name is always 3 characters which seems to be true most of the time (always?).

You may override this by first manually detecting all separators and feeding parse_segment with a complete metadata structure.

Note
this detection only runs if read_meta() has not been used
Bug:
if crlf is -1 and the first 3 characters are not MSH, then we have to trust meta or stop here
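
The detection rule described in this section can be sketched as follows. This is an illustration only, not the parser's code: detect_field_separator_sketch() is a hypothetical helper, and it assumes that "position 3" means the zero-based index 3, i.e. the byte directly after the three-character segment name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of decode.h.
 * Returns the field separator found at index 3 of an MSH line,
 * or -1 if the line is too short or does not start with "MSH". */
int detect_field_separator_sketch(const unsigned char *line, size_t len)
{
    if (line == NULL || len < 4)
        return -1;

    /* Only an MSH segment carries the separator at this position;
     * for any other segment the caller has to trust its existing
     * metadata or stop. */
    if (memcmp(line, "MSH", 3) != 0)
        return -1;

    return line[3];
}
```

Returning -1 for non-MSH lines mirrors the bug note above: without an MSH segment there is nothing to detect from, so the decision is left to the caller.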

◆ print_error()

void print_error ( int  e,
char *  additional 
)

pretty print system errors

Deprecated:
replace with log_error()
Parameters
e             Error code
additional    Extra string to print

◆ version_parser()

const char* version_parser ( )

parser version information

Returns
version string as defined in CMakeLists.txt