Gedcom object model in C



Main functions

There are two ways to start with a GEDCOM object model (after having called gedcom_init): either by starting from scratch, or by starting from a given GEDCOM file.  This is done via the following two functions:
int gom_parse_file (const char* file_name);
This initializes the object model by parsing the GEDCOM file given by file_name.  It returns 0 on success and 1 on failure.
int gom_new_model ();
This starts an empty model.  Actually, this is done by processing the file "new.ged" in the gedcom-parse data directory.
In the GEDCOM object model, all the data is immediately available after calling gom_parse_file() or gom_new_model().  For this, an entire model based on C structs is used.  These structs are documented separately, and follow the GEDCOM syntax quite closely.  Each of the records in a GEDCOM file is modelled by a separate struct, and some common sub-structures have their own struct definition.

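Accessor functions are available to get at these structs, for example gom_get_header() and gom_get_first_XXX(), both of which appear in the examples further on.
In the function names, XXX stands for one of the following: family, individual, multimedia, note, repository, source, submitter, user_rec.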

Object model structure

Object lists

All records of a certain type are linked together in a linked list.  The above functions only give access to the first record of each linked list.  The others can be accessed by traversing the linked list via the next member of the structs.  This means that e.g. the following piece of code will traverse the linked list of family records:
struct family* fam;

for (fam = gom_get_first_family() ; fam ; fam = fam->next) {
  ...
}

The next member of the last element in the list is guaranteed to have the NULL value.

Actually, the linked list is a doubly-linked list: each record also has a previous member.  For implementation reasons, however, the behaviour of this previous member at the edges of the linked list is not guaranteed: it can be circular or terminated with NULL, so application code must not make any assumptions about it.
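As a minimal sketch of these conventions, the following code counts the elements of such a list and finds its last element using only the next member.  The struct record below is a stand-in invented for this example (the real structs come from the library headers), and count_records and last_record are hypothetical helper names, not library functions.

```c
#include <stddef.h>

/* Stand-in for any object model struct that has next/previous members. */
struct record {
  struct record* next;
  struct record* previous;
};

/* Count the elements by following next until the guaranteed NULL. */
size_t count_records(const struct record* first)
{
  size_t n = 0;
  const struct record* rec;

  for (rec = first; rec; rec = rec->next)
    n++;
  return n;
}

/* Find the last element without touching previous, whose value at the
   edges of the list is not guaranteed. */
struct record* last_record(struct record* first)
{
  struct record* rec = first;

  while (rec && rec->next)
    rec = rec->next;
  return rec;
}
```

Walking forward costs a full pass over the list, but it relies only on the one property that is guaranteed: the next member of the last element is NULL.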

This linked-list model applies also to all sub-structures of the main record structs, i.e. each struct that has a next and previous member follows the above conventions.  This means that the following piece of code traverses all children of a family (see the details of the different structs here):
struct family* fam = ...;

struct xref_list* xrl;
for (xrl = fam->children ; xrl ; xrl = xrl->next) {
  ...
}

Note that all character strings in the object model are encoded in UTF-8; see below for how to convert them automatically to and from the encoding of the current locale.

User data

Each of the structs has an extra member called extra (of type struct user_data*).  This gathers all non-standard GEDCOM tags within the scope of the struct in a flat linked list, no matter what the internal structure of the non-standard tags is.  Each element of the linked list has:
This way, none of the information in the GEDCOM file is lost, even the non-standard information.


Modifying the object model

Note that the date manipulations are described here.

Manipulating strings

There are some functions available to retrieve and change strings in the Gedcom object model, depending on whether you use UTF-8 strings or locale-defined strings in your application.

The following functions retrieve and set the string in UTF-8 encoding:
char* gom_get_string (char* data);
char* gom_set_string (char** data, const char* str_in_utf8);

The first function is in fact superfluous, because it just returns the data, but it is there for symmetry with the functions given below for the locale-defined input and output.  

The second function returns the new value if successful, or NULL if an error occurred (e.g. failure to allocate memory or the given string is not a valid UTF-8 string).  It makes a copy of the input string to store it in the object model.  It also takes care of deallocating the old value of the data if needed.  Note that the set function needs the address of the data variable, to be able to modify it.  In the case of an error, the target data variable is not modified.
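The contract of the set function can be illustrated with a small stand-in.  This is not the library's implementation: sketch_set_string is a hypothetical name, UTF-8 validation is left out, and treating a NULL argument as an error is an assumption made only for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in, not the library's code: replace *data with a copy
   of str and return the new value, or return NULL and leave *data alone
   when something goes wrong. */
char* sketch_set_string(char** data, const char* str)
{
  char* copy;
  size_t len;

  if (!data || !str)
    return NULL;          /* assumption: missing arguments count as an error */

  len = strlen(str);
  copy = malloc(len + 1);
  if (!copy)
    return NULL;          /* allocation failed: *data is not modified */
  memcpy(copy, str, len + 1);

  free(*data);            /* release the old value; free(NULL) is a no-op */
  *data = copy;
  return copy;
}
```

The copy is made before the old value is released, which is what allows the target variable to stay untouched on failure.  It also shows why the address of the data variable is needed: the function has to store a new pointer in it.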

As an example, the following code retrieves and sets the system ID in the header:
struct header* head = gom_get_header();
char* oldvalue = gom_get_string(head->source.id);
char* newvalue = "My_Gedcom_Tool";

if (gom_set_string(&head->source.id, newvalue)) {
  printf("Modified system id from %s to %s\n", oldvalue, newvalue);
}


A second pair of functions retrieves and sets the string in the format defined by the current locale:
char* gom_get_string_for_locale (char* data, int* conversion_failures);
char* gom_set_string_for_locale (char** data, const char* str_in_locale);

These functions are used in the same way as the previous ones, but the strings are in the locale encoding: in the "en_US" locale, for example, the first function returns the string in ISO-8859-1 and the second function expects str_in_locale to be in that encoding.  Conversion to and from UTF-8 for the object model is done on the fly.

Since the conversion from UTF-8 to the locale encoding is not always possible, the get function has a second parameter that returns the number of conversion failures for the result string.  Pass a pointer to an integer if you want to know this, or NULL if you are not interested.  The set function returns NULL if an error occurred (e.g. if the given string is not a valid string for the current locale); in that case the target data variable is not modified.
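To make the notion of a conversion failure concrete, here is a minimal sketch of a UTF-8 to ISO-8859-1 conversion that counts the characters it cannot represent.  The library itself converts according to the current locale; the hand-written converter below, its fixed target encoding, the '?' replacement character and the name sketch_utf8_to_latin1 are all assumptions for illustration only.

```c
#include <stddef.h>

/* Convert a NUL-terminated UTF-8 string to ISO-8859-1.  Characters that
   cannot be represented, and malformed sequences, become '?' and are
   counted in *conversion_failures when that pointer is not NULL.
   Returns the number of bytes written, excluding the terminating NUL. */
size_t sketch_utf8_to_latin1(const char* in, char* out, size_t out_size,
                             int* conversion_failures)
{
  const unsigned char* p = (const unsigned char*)in;
  size_t written = 0;
  int failures = 0;

  if (out_size == 0)
    return 0;

  while (*p && written + 1 < out_size) {
    unsigned long cp, min;
    int extra, ok = 1;

    /* Decode the lead byte: payload bits, number of continuation bytes,
       and the smallest code point allowed for that length. */
    if (*p < 0x80)                { cp = *p;        extra = 0; min = 0; }
    else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; min = 0x80; }
    else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; min = 0x800; }
    else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; min = 0x10000; }
    else                          { cp = 0; extra = 0; min = 0; ok = 0; }
    p++;

    while (extra > 0) {
      if ((*p & 0xC0) != 0x80) { ok = 0; break; }   /* truncated sequence */
      cp = (cp << 6) | (*p & 0x3F);
      p++;
      extra--;
    }
    if (cp < min)
      ok = 0;                                       /* overlong encoding */

    if (ok && cp <= 0xFF) {
      out[written++] = (char)cp;   /* Latin-1 covers code points 0..255 */
    } else {
      out[written++] = '?';
      failures++;
    }
  }
  out[written] = '\0';
  if (conversion_failures)
    *conversion_failures = failures;
  return written;
}
```

In this sketch the conversion_failures argument follows the same convention as the get function above: pass a pointer to an integer to receive the count, or NULL to ignore it.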

Adding and removing records

For each of the record types, there are two functions to add and remove records:
struct XXX*   gom_new_XXX(const char* xref);
int           gom_delete_XXX(struct XXX* obj);

The XXX stands for one of the following: family, individual, multimedia, note, repository, source, submitter, user_rec.

For submission records, gom_delete_submission() takes no parameters (since there can be only one such object anyway).

When creating new records, the application is responsible for making sure that mandatory fields (according to the GEDCOM spec) are filled in afterwards.  In a later release, there will be checks in gom_write_file when something is missing.

Adding, removing and moving cross-references

For struct members that are of type struct xref_value, the following function is available:
struct xref_value*  gom_set_xref(struct xref_value** data, const char* xref);
This function modifies the data variable to point to the given xref, taking care of unreferencing the old value, and referencing the new value.  If an error occurs, NULL is returned (and the data variable is not changed).  If xref is NULL, the data is set to NULL.

For struct members that are of type struct xref_list, the following functions are available:
struct xref_list*   gom_add_xref(struct xref_list** data, const char* xref);
int                 gom_remove_xref(struct xref_list** data, const char* xref);
int                 gom_move_xref(Gom_direction dir, struct xref_list** data, const char* xref);
The first function adds the given xref to the end of the data list.  The second function removes the given xref from the data list (if present; if not present an error is generated and 1 is returned).

The third function moves the given xref up or down the data list, depending on the dir parameter, which can be:
Again, an error is generated and 1 is returned if the given xref is not part of the list.  If the xref cannot be moved up (because it is the first in the list) or down (because it is the last in the list), a warning is generated, but the function still returns success (0).
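The return-value rules for moving an entry can be sketched on a plain array of xref strings.  Everything here is a stand-in: the library works on struct xref_list, and the names sketch_direction, SKETCH_MOVE_UP, SKETCH_MOVE_DOWN and sketch_move_xref are hypothetical, chosen so as not to suggest the actual Gom_direction values.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical direction names, not the library's Gom_direction values. */
typedef enum { SKETCH_MOVE_UP, SKETCH_MOVE_DOWN } sketch_direction;

/* Move xref one position up or down in list[0..count-1].
   Returns 1 if xref is not in the list, 0 otherwise.  When the entry is
   already at the edge, nothing moves and 0 is still returned (the library
   would additionally generate a warning). */
int sketch_move_xref(sketch_direction dir, const char** list, size_t count,
                     const char* xref)
{
  size_t i, j;
  const char* tmp;

  for (i = 0; i < count; i++)
    if (strcmp(list[i], xref) == 0)
      break;
  if (i == count)
    return 1;                       /* not part of the list: error */

  if (dir == SKETCH_MOVE_UP) {
    if (i == 0)
      return 0;                     /* already first: nothing to do */
    j = i - 1;
  } else {
    if (i + 1 == count)
      return 0;                     /* already last: nothing to do */
    j = i + 1;
  }

  tmp = list[i];                    /* swap with the neighbour */
  list[i] = list[j];
  list[j] = tmp;
  return 0;
}
```

A caller therefore only needs to treat a non-zero result as a real problem; an attempt to move past either end of the list is harmless.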

Adding, removing and moving substructures

For struct members that are just a single value, the following functions are available:
struct XXX*   gom_set_new_XXX(struct XXX** data);
int           gom_delete_XXX(struct XXX** data);

This is the case for XXX equal to address, change_date or place.  The first function creates a new substructure and assigns it to data (NULL is returned if there was already a value).  The second function deletes the value from data.

Note: for change_date structs there is also the following short-cut function, which updates the date and time directly:
int gom_update_timestamp (struct change_date** obj, time_t tval);

For struct members that are a list (as described in the section on object lists), the following functions are available:
struct XXX*   gom_add_new_XXX(struct XXX** data);
int           gom_remove_XXX(struct XXX** data, struct XXX* obj);
int           gom_move_XXX(Gom_direction dir, struct XXX** data, struct XXX* obj);

This is the case for all XXX structs that have a next and previous member.  The first function creates a new substructure and adds it to the end of the data list.  The second function deletes the object from the data list (if present; if not present, an error is generated and 1 is returned).

The third function moves the given obj up or down the data list, depending on the dir parameter, similar to the xref functions above.


Writing the object model to file

Writing the current object model to a file is simply done using the following function:
int gom_write_file (const char* filename, int* total_conv_fails);
This writes the model to the file filename.  The second parameter can return the total number of conversion failures (pass NULL if you're not interested).  The functions in this section can be used before gom_write_file to control some settings.

Before you write the file, you can update the timestamp in the header using the following function:
int gom_header_update_timestamp (time_t tval);
This sets the date and time fields of the header to the time indicated by tval.  The function returns 0 on success, non-zero if an error occurred.  Typically, the function would be used as follows, to set the current time in the timestamp:
int result;
result = gom_header_update_timestamp(time(NULL));
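
To show what date and time fields derived from a time_t can look like, the sketch below formats a timestamp in the GEDCOM style, i.e. a date such as "2 FEB 2003" and a time such as "14:40:05".  It is not the library's implementation: sketch_format_timestamp is a hypothetical helper, and the use of UTC (via gmtime) is an assumption made to keep the output reproducible; the library may well use local time.

```c
#include <stdio.h>
#include <time.h>

/* Format tval as a GEDCOM-style date and time.
   Returns 0 on success, 1 on failure (bad time value or buffer too small). */
int sketch_format_timestamp(time_t tval, char* date, size_t date_size,
                            char* time_str, size_t time_size)
{
  static const char* const months[12] = {
    "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
    "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"
  };
  const struct tm* tm = gmtime(&tval);   /* assumption: UTC, not local time */
  int n;

  if (!tm)
    return 1;

  /* GEDCOM dates use an upper-case three-letter month, so the month table
     is used instead of the locale-dependent strftime("%b"). */
  n = snprintf(date, date_size, "%d %s %d",
               tm->tm_mday, months[tm->tm_mon], tm->tm_year + 1900);
  if (n < 0 || (size_t)n >= date_size)
    return 1;

  n = snprintf(time_str, time_size, "%02d:%02d:%02d",
               tm->tm_hour, tm->tm_min, tm->tm_sec);
  if (n < 0 || (size_t)n >= time_size)
    return 1;

  return 0;
}
```

Like gom_header_update_timestamp, the sketch returns 0 on success and non-zero on failure, so both can be checked in the same way.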



$Id: gom.html,v 1.6 2003/02/02 14:40:05 verthezp Exp $
$Name: R0_90_0 $