http://www.perlmonks.org?node_id=1005964

jagexCoder has asked for the wisdom of the Perl Monks concerning the following question:

Hi there! I am working on some serial networking code and need to find packets of interest that are delimited by STX and ETX. We have a device that sends data in the format "STX|PAYLOAD|CRC|ETX" (without the | symbols) to another device, for which I am writing a simulator script. I want the receive routine to start copying a valid data packet into the buffer only once it sees an STX, and to stop reading once it sees an ETX or once 80 bytes have been received without an ETX. I am using "Win32::SerialPort". Here's what I have so far:

sub writeSerialPort {
    my $WRITE_TO_PORT = shift;
    $port->write($WRITE_TO_PORT) || die("Writing to the serial port failed: $!\n");
    # STDOUT -> flush;
    sleep 2
}

sub readSerialPort {
    sleep 3;
    $byte=$port->input;
    $buffer .= $byte;
    #print "\nBuffer: " . $buffer;
    #print "\n";
    #print "\nByte: " . $byte;
    $byte = ""; # This variable is emptied each time new data comes in and is added to the buffer
}

I thought of using regular expressions to detect the STX and ETX components. For example, if the buffer contained 'cool', something along the lines of: print $buffer =~ /(\0x63\0x6F\0x6F\0x6C)/; This does not return true. I'm somewhat new to Perl, so can someone please shed some light on this and give me guidance on how to get the checking I described at the start working? Thanks!
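
A likely reason the pattern above never matches, offered as a sketch rather than a confirmed diagnosis: in a Perl regex, \0 is an octal escape for the NUL character, so \0x63 means "NUL followed by the literal text x63". The hex escape is \x63. The sample strings and variable names below are illustrative only.

```perl
use strict;
use warnings;

my $buffer = 'cool';

# \0x63 is NUL followed by the literal characters "x63", so this fails.
my $wrong = ($buffer =~ /\0x63\0x6F\0x6F\0x6C/) ? 1 : 0;

# \x63 is the hex escape for "c", so this matches.
my $right = ($buffer =~ /\x63\x6F\x6F\x6C/) ? 1 : 0;

# The same escapes work for the control characters STX (0x02) and ETX (0x03).
my $frame = "\x02PAYLOAD\x03";
my ($payload) = $frame =~ /\x02([^\x03]*)\x03/;

print "wrong=$wrong right=$right payload=$payload\n";
```

The last pattern captures everything between the first STX and the next ETX, which is the basic shape the replies build on.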

Re: Detect STX and ETX hex in received string
by kcott (Archbishop) on Nov 28, 2012 at 07:28 UTC

    G'day jagexCoder,

    Welcome to the monastery.

    This regexp seems to do what you want:

    qr{ [$STX] ( [^$ETX]{0,80} ) [^$ETX]* [$ETX] }x

    Here's my test:

    $ perl -Mstrict -Mwarnings -e '
        my ($STX, $ETX) = (chr(2), chr(3));
        my $re = qr{ [$STX] ( [^$ETX]{0,80} ) [^$ETX]* [$ETX] }x;

        my $empty = $STX . $ETX;
        my $short = $STX . "a" x 79 . $ETX;
        my $exact = $STX . "a" x 80 . $ETX;
        my $long = $STX . "a" x 81 . $ETX;

        my $buf = $empty . $short . $exact . $long;

        print "1234567890" x 8, "\n";
        print "$_\n" for $buf =~ /$re/g;
        print "1234567890" x 8, "\n";
    '
    12345678901234567890123456789012345678901234567890123456789012345678901234567890

    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    12345678901234567890123456789012345678901234567890123456789012345678901234567890

    -- Ken

      This regex would not match anything unless an ETX was found, while the poster wanted to stop reading as soon as 80 bytes were found.

      (Apologies; originally this comment contained a typo such that it made no sense)



      When's the last time you used duct tape on a duct? --Larry Wall

        I'd say that's a matter of interpretation. I did consider the situation where no ETX existed; however, jagexCoder wrote "... packets of interest that are within STX and ETX endings.". Accordingly, I coded for STX and ETX to always be present. Perhaps wait for the OP to clarify this.

        [Aside: You appear to have removed the original typo, so I don't know what you're apologising for. Regardless, I accept your apology. :-) ]

        Update: Oh dear! Having posted this, I see I've also introduced a typo such that it made no sense.
        s/Perhaps wait for to OP can clarify this./Perhaps wait for the OP to clarify this./

        -- Ken

Re: Detect STX and ETX hex in received string
by ColonelPanic (Friar) on Nov 28, 2012 at 09:08 UTC
    Here is an example regex that will match as soon as 80 characters or an ETX is found. I used kcott's answer as a starting point.

    use Modern::Perl;

    my $STX = 'a';
    my $ETX = 'z';

    my $re = qr{ [$STX] (?: (?<match>[^$ETX]{80}) | (?:(?<match>[^$ETX]{0,79}) [$ETX])) }x;

    $_='a12345678za1234567890123456789012345678901234567'.
       '890123456789012345678901234567890123456789012345678901234567890z';

    say $+{match} while (/$re/g);

    Note that this uses named capture to put either set of matching parentheses into a single variable.
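
As a small illustration of the named-capture point, here is a sketch (Perl 5.10 or later) in which two alternatives share one group name. The pattern and inputs are made up for the example and are unrelated to the serial data.

```perl
use strict;
use warnings;

# Two groups share the name "match"; only one alternative can succeed.
my $re = qr{ ^ (?: (?<match>\d+) | (?<match>[a-z]+) ) $ }x;

my @seen;
for my $input ('12345', 'hello') {
    if ($input =~ $re) {
        # $+{match} holds whichever same-named group actually matched.
        push @seen, $+{match};
    }
}

print "$_\n" for @seen;
```

Whichever branch matches, the calling code reads the same variable, so it does not need to know which branch fired.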




      It appears that your solution only allows for a single-character STX or ETX token, due to the check for character classes (eg [$STX]). I am presently experiencing too much of a coffee deficiency to offer an alternative.

      Update: ... and to not recognize that STX and ETX are possibly symbols for one character tokens :-)

      --MidLifeXis

        I interpreted single-character STX and ETX codes as being intrinsic to the problem. Maybe that is not so, however. Here is a multi-character solution:
        use Modern::Perl;

        my $STX = 'foo';
        my $ETX = 'bar';

        my $re = qr{ $STX (?: (?<match>.*?) $ETX | (?:(?<match>.{80}))) }x;

        $_='foo12345678barfoo1234567890123456789012345678901234567'.
           '890123456789012345678901234567890123456789012345678901234567890bar';

        say substr($+{match},0,80) while (/$re/g);

        (I added a substr() to limit the matched string to 80 characters. It appears to me that it's not trivial to meet these requirements entirely within the regex...though I'm sure others will quickly prove me wrong!)



Re: Detect STX and ETX hex in received string
by space_monk (Chaplain) on Nov 28, 2012 at 15:18 UTC

    I'm not sure this is an answer, perhaps a question for other Perl Monks. :-)

    I'm not convinced that a regex is the way to go here, as I suspect the code should really just start accumulating data on detecting an STX and send it to the output on reaching 80 chars or an ETX. Adding regex detection seems, to me, to be adding a lot of overhead.
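
One way the accumulate-without-a-regex idea could look, as a minimal sketch: the helper name extract_packets and the sample data are hypothetical, and the input is an ordinary string standing in for bytes read from the port.

```perl
use strict;
use warnings;

my ($STX, $ETX, $MAX) = ("\x02", "\x03", 80);

# Hypothetical helper: scan a string of received bytes and return the
# payloads found. A packet ends at ETX or after $MAX payload bytes.
sub extract_packets {
    my ($data) = @_;
    my @packets;
    my $current;                              # undef while waiting for an STX

    for my $byte (split //, $data) {
        if (!defined $current) {
            $current = '' if $byte eq $STX;   # start of a packet
            next;                             # ignore noise between packets
        }
        if ($byte eq $ETX) {
            push @packets, $current;
            $current = undef;
            next;
        }
        $current .= $byte;
        if (length($current) >= $MAX) {       # limit hit without an ETX
            push @packets, $current;
            $current = undef;
        }
    }
    return @packets;
}

my @got = extract_packets("junk\x02abc\x03noise\x02" . ('x' x 85) . "\x02ok\x03");
print length($_), ": $_\n" for @got;
```

In this sketch an STX seen inside a packet is treated as payload, and bytes that arrive after the 80-byte cutoff are dropped until the next STX. Both are design choices that the real protocol may need to handle differently.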

    A Monk aims to give answers to those who have none, and to learn from those who know more.
Re: Detect STX and ETX hex in received string
by GrandFather (Saint) on Nov 28, 2012 at 23:49 UTC

    I'd use a state machine and, to avoid manifest global variables, I'd wrap it up in a lightweight object:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Win32::SerialPort;

    my $kSTX = "\x02";
    my $kETX = "\x03";

    my $obj = bless { port => Win32::SerialPort->new('COM1'), idle => 1 };

    $obj->configurePort();

    while (1) {
        my $crc = $obj->readSerialPort();

        next if !defined $crc;
        print $obj->{buffer};
    }
    continue {
        sleep 1;
    }

    sub configurePort {
        my ($self) = @_;

        $self->{port}->...;
    }

    sub writeSerialPort {
        my ( $self, $outStr ) = @_;

        $self->{port}->write($outStr) || die "Serial port write failed: $!\n";
    }

    sub readSerialPort {
        my ($self) = @_;

        while ( my $byte = $self->{port}->input() ) {
            # While idle, discard everything until an STX arrives
            next if $self->{idle} && $byte ne $kSTX;

            if ( $byte eq $kSTX ) {
                # Start of record: reset the buffer and leave the idle state
                $self->{buffer} = '';
                $self->{idle} = undef;
                next;
            }

            if ( $byte ne $kETX && 80 > length $self->{buffer} ) {
                $self->{buffer} .= $byte;
                next;
            }

            # Got end of record
            my $crc = 0;

            $crc += ord($_) for split '', $self->{buffer};
            $crc &= 0xFF;
            $self->{idle} = 1;
            return $crc;
        }

        return;
    }

    The code assumes that the input function times out after some reasonable time and that nulls are not used in the payload data. None of this is tested code!

    Note too the use of strictures - always use strictures (use strict; use warnings;)

    True laziness is hard work
Re: Detect STX and ETX hex in received string
by jmlynesjr (Deacon) on Nov 28, 2012 at 23:08 UTC

    Timely post, as I am planning to do something similar using Device::SerialPort::Arduino. I plan to process the message one character at a time, probably using encode/decode. To allow variable-length packets, add a data character count to the payload in case your data contains something that looks like an ETX; it also helps in detecting framing errors sooner rather than later.
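
A rough sketch of the character-count idea follows. The frame layout (STX, one length byte, payload, ETX) and the helper names build_frame and parse_frame are assumptions made for illustration; they are not taken from the OP's device or from any module.

```perl
use strict;
use warnings;

my ($STX, $ETX) = ("\x02", "\x03");

# Hypothetical frame layout: STX, one length byte, payload, ETX.
sub build_frame {
    my ($payload) = @_;
    die "payload too long\n" if length($payload) > 255;
    return $STX . pack('C', length($payload)) . $payload . $ETX;
}

# Returns the payload, or undef if the frame is malformed.
sub parse_frame {
    my ($frame) = @_;
    return undef if length($frame) < 3;
    return undef if substr($frame, 0, 1) ne $STX;
    my $len = unpack('C', substr($frame, 1, 1));
    return undef if length($frame) != $len + 3;     # framing error
    return undef if substr($frame, -1) ne $ETX;
    return substr($frame, 2, $len);
}

# The payload contains a byte equal to ETX, yet it survives the round trip.
my $payload = "ab\x03cd";
my $decoded = parse_frame(build_frame($payload));
my $ok      = (defined $decoded && $decoded eq $payload) ? 1 : 0;
my $message = $ok ? 'round trip ok' : 'mismatch';
print "$message\n";
```

Because the receiver knows how many bytes to expect, an ETX-valued byte inside the payload is not mistaken for the end of the frame, and a length that disagrees with the frame size flags a framing error immediately.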

    Am I the only one old enough to remember SYNSYNSYNSTXDATAETXCRC as the IBM Bi-Sync protocol from the '70s? In this context STX and ETX are ASCII character codes.

    James

    There's never enough time to do it right, but always enough time to do it over...

Re: Detect STX and ETX hex in received string
by jagexCoder (Novice) on Nov 29, 2012 at 01:58 UTC
    Thanks for the assistance, guys. I would like to add that the packets (the payload, not STX/ETX) come in as a string of characters in ASCII hex format. So the STX will be sent as a '2' and ETX will be sent as a '3'. As for the payload in between: if, for example, it has a '15', that will be sent as 2 bytes, '1' and '5'. I did something like this:
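
To make the ASCII hex description concrete, here is a minimal sketch. It assumes the payload is an even number of hex digits, each pair encoding one value, framed by the control bytes 0x02 and 0x03; the sample frame is invented, and the real device format may differ.

```perl
use strict;
use warnings;

# Assumed wire format: STX (0x02), payload as ASCII hex digits, ETX (0x03).
my $frame = "\x02" . "15A0" . "\x03";

# Strip the framing bytes and keep only the hex digits in between.
my ($hex) = $frame =~ /\x02([0-9A-Fa-f]*)\x03/;

# Each pair of hex digits becomes one numeric value.
my @values = map { hex($_) } $hex =~ /(..)/g;

print "hex digits: $hex\n";
print "values: @values\n";
```

Under these assumptions the digits '1' and '5' decode to the single value 0x15, which is 21 in decimal.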
    my @buffer = {}; # Create an empty buffer array
    my $byte = "";

    sub readSerialPort {
        # $byte contains the incoming data from the other host on the serial line
        # The buffer ensures that all the data is stored in one location.
        # Read from the buffer and manually empty it after processing!
        # sleep 3;
        $byte=$port->input;
        if ($byte eq chr(02)) {
            print "STX was found! " . $byte . "\n";
            do {
                push(@buffer, $byte);
            } while ($byte != chr(03));
            if ($byte eq chr(03)) {
                print "ETX was found! " . $byte . "\n";
            }
        }
    }

    print "\nContents of byte is " . $byte . "\n".
    print "\nContents of buffer is " . @buffer . "\n".

    I am using Docklight to send the serial commands. If I transmit in hex "02 32 34 23 23 23 03", the output shows:
    Contents of buffer is 1 1 Contents of byte is 24###
    I was expecting the buffer to show [02|32|34|23|23|23|03]. Also, since it shows the buffer array as 1, I can't use $buffer[0], $buffer[1], etc. to pick out certain parts of the received string of characters (which I will need later on to distinguish between the payload and CRC between the STX and ETX). Any ideas? Is my code heading in the right direction? Thanks. EDIT: I would like to mention that the "byte" variable is displayed properly; the STX and ETX do not show up on the forum, but they show just fine in my console window.
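
A possible explanation for the "1", sketched with plain strings instead of a live port: "my @buffer = {}" stores a single hash reference in the array, and concatenating an array with "." yields its element count. Also, $port->input appears to return whatever bytes are available rather than one byte, so comparing the whole string against chr(02) would fail; and the do/while loop compares strings with the numeric != operator and never reads a new byte. The sample data below simply mirrors the Docklight bytes quoted above.

```perl
use strict;
use warnings;

# "= {}" stores one hash reference in the array, so it is not empty.
my @not_empty = {};
my @buffer    = ();           # an empty array

# Sample data standing in for what $port->input might return in one call.
my $input = "\x02" . "24###" . "\x03";

# Several bytes can arrive at once, so split them into separate elements.
push @buffer, split //, $input;

# Concatenating an array with "." uses scalar context: the element count.
my $as_count = "count: " . @buffer;

# Format each byte as two hex digits to see the contents.
my $as_hex = join '|', map { sprintf('%02X', ord $_) } @buffer;

print scalar(@not_empty), "\n";
print "$as_count\n";
print "[$as_hex]\n";
```

With the bytes stored one per element, $buffer[0] is the STX, $buffer[-1] is the ETX, and an array slice such as @buffer[1 .. $#buffer - 1] gives the payload and CRC in between.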
      Solved the issue..thanks for your help everyone.