Parsing data with help of regular expressions
Advanced Serial Data Logger you can download here. Plugins can be downloaded separately here.
I have a piece of equipment that transmits serial data to a terminal program
(data showed below), in this data is the position of the data on the screen,
followed by the actual data. I want to only save the data (marked
by red color) and not the positional information, can this be done (data blocks
separated by green color)? Also can the data be
sent directly to Excel?
Fig.1. Serial data captured
It is assumed that:
You've configured communication parameters (baud rate, number of data bits, flow control etc) in the data logger and can receive any data without communication errors.
The image above shows, that the data flow contains non-printable characters (squares
on the image above) and doesn't show ending characters of a data packet. We need to recognize ending characters of an each data record. Please, enable display output for non-printable characters with a character code below than 0x20h. Please,
set the following options.
Fig.2. Serial data view setup
Then enable the "Wrap words" option (fig.3), because a data block is very
large and doesn't fit in the program window. You can open the dialog window below by selecting the "Options - Program options" menu item.
Fig.3. Data logger window view
Then click the "OK" button and try to receive data from a port. You should receive the data like on the figure below.
Fig.4. Data logger. Data received.
This is the other view of the data received. Notice that all non-printable characters were replaced with their code i.e. #1B. It is clear from the data screen above, that the data block (within green brackets) starts with ":sr#1B" and ends with #0A#0D#0A (underlined by red). In this example, our data block contains two parts namely, 1 and 2. The part #2 was adjusted to the width of the window. Both of these parts should be interpreted as a single whole.
We are now ready for configuring of the modules. First, select the "ASCII data query and parser" plug-in (Fig.5, pos. #1) from a drop-down list. Then, enable a parsing option for data received (fig. 5a, pos. 2) and select necessary data export plug-ins. The DDE server (fig.5b, pos.3) will help us to check that the data packet is parsed and exported. The "Local database" plug-in will create Microsoft Excel files.
Fig.5a. Data logger. The data parser plug-in setup.
Fig.5b. Data logger. Data export plug-ins setup.
Now, please, open a configuration window of the ASCII parser and query plug-in by clicking the "Setup" button near the drop-down box (fig.5a, pos.1). The dialog window will be showed on the desktop (fig.6).
Fig.6. Data logger. The data parser configuration window.
The configuration process should be very simple if you have examined your data flow in the data logger window (Fig.4). You should type in the same as in the data logger window in fields 1 and 2. Field #1 marks the beginning of the data block and the field #2 marks the end. The values to be typed in here are as underlined in red in Fig.4 above. In the field #3 you should specify a timeout value, which will be used if the module will not receive ending characters in the specified interval.
Because our data flow contains
data packets, that we don't want to export, then we've added two filter rules (fig.7). You can add new rules by clicking the "Add item" button. In our case we are excluding all data packets that contain strings "Data" or "data" (characters case is important in the "Expression" field).
Fig.7. Data logger. Filter rules.
The next page is a very important part of the parser configuration. The data parser uses this information for data extraction from the data packet. In the example, the data block contains few data items (see fig.1) namely: Date 1-Date 12, which should be separated out to different variables. Later, these variables will be used in the data export and will be placed to different columns of our Excel spreadsheet.
Fig.8. Data logger. Parser items.
Any new items may be added by clicking the "Add item" button (pic.8, pos. 7). Before adding an item the program will ask you about an item description. You can type any characters here, which will help you to remember a variable's content. For this example all 4 variables with their corresponding descriptions have been added. We've added all 12 items with descriptions from
Data 1 to Data 12.
Each parser item has a number of properties:
- Item name - this name will be bound to a column in the data export plug-in. It is a limited text description and cannot contain spaces and a few other characters;
- Parser's type -is a method, which the program will use for data extraction. Our parser has a few methods from simple to most powerful. In this example, where data is being placed in random place, we can use the most powerful method - regular expressions. This method allows extracting of any number of characters by a mask from any position of the data block. This mask
can be adapted to characters in the data packet;
- Data type -
is the data type of the characters extracted. If you'll specify a data type other than the "STRING" data type, then the module will try to convert the string value to the specified data type. The module allows you to configure several options that will be used at the conversion (see the "Data format" tab);
- Default value - the value specified here is to be used when data cannot be extracted from a data packet (or can't be converted to the specified data type);
- Regular expression - in
our case we needs numbers, that are placed after #1B[XX;XXH characters,
where XX may be one or two numbers. Then we should specify the following regular expression: \[\d+\;\d+H[\ \+]+([\d\.]+). Here \d+ is one
or more numbers. The expression in parenthesis is our number, which we
- Entrance position - all our values can be found by one
mask, but just an entrance position of each item in the data packet is vary. This field allows
this position in the data packet.
All other items have same parameters, just entrance position is different. In this example this value from the range of 1 to 12.
In the next tab, you can specify basic format options as per (pic.9). If you had specified the data type "String" in the item's parameters, then the first two options allow you to remove blank spaces from a value.
Fig.9. Data logger. Format of items.
Other options are unnecessary in our case, because all our items have the string
Click the "OK" button and close the parser configuration window and then click the "OK"
button in the options window.
Now that our parser is ready it is time for testing it. Connect your device and power it on if necessary. Check to see if you are able to receive a data block from the specified serial port. If the parser had been correctly set up, then you should see all parser items names and their values (fig.10) in the DDE server window below.
Fig.10. Data logger. DDE server window.
All parser items are now ready for export to an Excel spreadsheet. You can read about it in the second part.