Parser Settings

Related Topics:  Load Mode  |  Import data using a Batch Script  |  Measurement Properties  |  Presets

Overview

The parser settings control the aspects of data import that are not covered by the column specifications: in particular, how the data file should be converted into lines and columns, whether header or footer lines are present, and how to react when instance names match instances already in the database.

Text Encoding

Specifies which character encoding is used in the file. Files without any special characters (such as µ or α) are usually in US-ASCII format, but other encodings with a wider range of characters also exist. The most commonly seen non-ASCII encodings belong to the UTF family, which comes in 8- and 16-bit variants (UTF-8 and UTF-16).

It is not generally possible to determine automatically which text encoding is being used, so if you encounter problems when reading data in an unusual format, you should check the documentation for the software which produced the file, or work through the various possible encodings until the "View file contents" preview shows something recognisable.
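The trial-and-error approach described above can also be tried outside the application. The following is a minimal Python sketch, not part of the import tool: the function name preview_encodings, the candidate list and the sample text are all assumptions made for illustration.

```python
def preview_encodings(raw, candidates=("ascii", "utf-8", "utf-16")):
    """Decode raw bytes with each candidate encoding.

    Returns a dict mapping each encoding name to the decoded text, or to
    None when the bytes are not valid in that encoding.
    """
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Hypothetical sample: a line containing a non-ASCII character, saved as UTF-8.
sample = "Thickness (µm)\t12.5\n".encode("utf-8")
for name, text in preview_encodings(sample).items():
    print(name, "->", repr(text))
```

A successful decode does not prove that the encoding is the right one, since some encodings accept almost any byte sequence. As with the preview, the text still has to be checked by eye.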

Detecting Columns

The column delimiter specifies one or more characters which are used to separate columns. The default setting is \t, which represents the TAB character.

The line prefix identifies one or more characters which occur at the start of every line and which should be ignored.

Similarly, the line suffix identifies one or more characters which occur at the end of every line and which should be ignored.

If, for example, the data in each column has been wrapped up in pairs of " characters, like this:

  "item1", "item2", "item3"

then the values can be extracted by setting the column delimiter to "," and the line prefix and line suffix to "

Note that it is also possible to use a column specification modifier (namely {filter="unquote"}) to remove pairs of single or double quotes from individual values.
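To make the interaction between the column delimiter, line prefix and line suffix concrete, here is a minimal Python sketch of one plausible reading of the behaviour described in this section. It is not the tool's actual implementation: the function split_line is hypothetical, and the sample line deliberately has no spaces after the commas so that the three-character delimiter "," matches exactly.

```python
def split_line(line, delimiter="\t", prefix="", suffix=""):
    """Strip an optional line prefix and suffix, then split on the delimiter."""
    if prefix and line.startswith(prefix):
        line = line[len(prefix):]
    if suffix and line.endswith(suffix):
        line = line[:-len(suffix)]
    return line.split(delimiter)


# Hypothetical quoted line: the prefix and suffix remove the outer quotes,
# and the delimiter (quote, comma, quote) removes the inner ones.
print(split_line('"item1","item2","item3"', delimiter='","', prefix='"', suffix='"'))

# Default settings: TAB-separated values with no prefix or suffix.
print(split_line("a\tb\tc"))
```

If the file puts whitespace between the closing quote, the comma and the next opening quote, the delimiter has to include that whitespace, otherwise the line will not split.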

Ignoring Lines

It is possible to ignore a fixed number of lines at the start or end of the file, or to restrict processing to lines that fall within a pair of regular expressions which identify the start and end of the data of interest. Blank lines or lines that start with a particular sequence of characters can also be ignored.

Lines which are not ignored will be processed, i.e. they will be converted into columns and passed into the loading process for possible addition to the database.

The following parameters are used to select which lines should be ignored (and therefore which lines are processed):

Ignore the first ... lines

The specified number of lines, counting from the top of the file, will be ignored.

Ignore the last ... lines

The specified number of lines, counting from the bottom of the file, will be ignored.

Ignore lines until ... seen

All lines before and including the first one to match this regular expression will be ignored.

Ignore lines once ... seen

Once a line which matches this regular expression is seen, that line and all subsequent lines will be ignored.

Ignore lines beginning with ...

If this option is selected, then any lines which start with the specified sequence of characters will be ignored.

Ignore blank lines

If this option is selected, then all lines which are blank will be ignored (otherwise such a line will cause an error).

Note that each of the above parameters has a checkbox toggle switch to the left of it which must be switched on in order for the parameter to be active.
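The combined effect of these parameters can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tool's own logic: the function select_lines and its argument names are hypothetical, the order in which the rules are applied is assumed (this page does not specify it), the regular expressions are matched anywhere in the line, and whitespace-only lines are treated as blank. Passing None, 0 or False for an argument corresponds to leaving its checkbox switched off.

```python
import re


def select_lines(lines, skip_first=0, skip_last=0, until=None, once=None,
                 comment_prefix=None, ignore_blank=False):
    """Return the lines that would be processed (i.e. not ignored)."""
    # Ignore the first / last N lines.
    end = len(lines) - skip_last
    kept = lines[skip_first:end] if end > skip_first else []

    # Ignore lines until ... seen: drop everything up to and including
    # the first matching line.
    if until is not None:
        for i, line in enumerate(kept):
            if re.search(until, line):
                kept = kept[i + 1:]
                break
        else:
            kept = []  # assumption: marker never seen, so nothing is processed

    # Ignore lines once ... seen: drop the matching line and all that follow.
    if once is not None:
        for i, line in enumerate(kept):
            if re.search(once, line):
                kept = kept[:i]
                break

    result = []
    for line in kept:
        if comment_prefix and line.startswith(comment_prefix):
            continue  # Ignore lines beginning with ...
        if ignore_blank and not line.strip():
            continue  # Ignore blank lines (whitespace-only counts: assumption)
        result.append(line)
    return result


# Hypothetical file contents, one list entry per line.
sample = ["Instrument log", "[DATA]", "a\t1", "# comment", "", "b\t2",
          "[END]", "checksum 1234"]
print(select_lines(sample, skip_first=1, until=r"^\[DATA\]", once=r"^\[END\]",
                   comment_prefix="#", ignore_blank=True))
```

With these settings only the two data lines remain; everything else is ignored before the remaining lines are split into columns.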

Name Matching

The Name Matching controls determine what to do when the name assigned to an instance being loaded matches the name of one or more instances already in the database.

There are three possible name matching modes:

CreateNew

A new instance will be created, even if an instance with a matching name already exists. This is the default setting.

UpdateFirst

The first instance (in chronological order of loading) with a matching name will be updated. In normal circumstances this will correspond to the matching instance which was created before any of the other matching instances. If there are no instances with a matching name, then a new one will be created.

UpdateLast

The last instance (in chronological order of loading) with a matching name will be updated. In normal circumstances this will correspond to the most recently created instance with a matching name. If there are no instances with a matching name, then a new one will be created.

When you wish to update existing instances with new information, the correct setting will probably be UpdateLast, as it ensures that the most recently created matching instance is chosen.

Note that in both UpdateFirst and UpdateLast mode, if there are no instances with a matching name, then a new instance will be created.
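The three modes can be summarised with a small Python sketch. It models the database as a plain list of instance names in chronological order of loading and assumes that names match by exact string equality; both are simplifications, and the function resolve_target is hypothetical rather than part of the application.

```python
MODES = ("CreateNew", "UpdateFirst", "UpdateLast")


def resolve_target(existing_names, name, mode="CreateNew"):
    """Return the index of the instance to update, or None to create a new one.

    existing_names lists instance names in chronological order of loading.
    """
    if mode not in MODES:
        raise ValueError(f"unknown name matching mode: {mode}")
    # Assumption: a name matches only if it is exactly equal.
    matches = [i for i, n in enumerate(existing_names) if n == name]
    if mode == "CreateNew" or not matches:
        return None  # a new instance is created
    return matches[0] if mode == "UpdateFirst" else matches[-1]


# Hypothetical database contents: "wafer1" has been loaded twice.
loaded = ["wafer1", "wafer2", "wafer1"]
for mode in MODES:
    print(mode, "->", resolve_target(loaded, "wafer1", mode))
```

A return value of None stands for creating a new instance, which is what happens in every mode when no existing name matches.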

Presets

Once the correct settings for importing data from a specific file format have been worked out, it is useful to be able to save them for future reuse; this is the purpose of presets.

Presets allow you to save and recall a complete set of parser settings and data settings conveniently. Presets are stored in simple text files which can easily be shared between users.

Full details are given in the Presets topic.