PerlMonks  

Splitting folded MIME headers into individual headers?

by sebastiannielsen2 (Novice)
on Mar 02, 2015 at 20:43 UTC ( [id://1118474]=perlquestion )

sebastiannielsen2 has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to split a scalar containing multiple MIME header lines, some of which might be folded, into an array that holds one complete header per element.

I tried with the following:

@fixedheaders = split(/\n\S/, $fixedmsgheader);

But that eats the first character of each header line. I need to split the headers in a way that takes folded header lines into account and does NOT split a folded header line in the middle.

Replies are listed 'Best First'.
Re: Splitting folded MIME headers into individual headers?
by jeffa (Bishop) on Mar 02, 2015 at 21:12 UTC

    Did you have a look at or try something like MIME::Parser? It seems to be able to do what you need without you having to code the parsing yourself.

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    

      I'm already using that. What I need is to get the generated data in name => value format, as input for Sendmail::PMilter.

      If PMilter could accept an opaque header object, I would just pass the output from MIME::Parser to PMilter, but as it is I need to use: $ctx->Addheader(NAME, VALUE)

      Thus I need access to the individual header lines in a way that makes it possible to iterate over the headers.

      Any folding must be kept as-is to keep the output RFC compliant.
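
For illustration, here is a minimal sketch of how the name => value pairs might be produced once the headers have been split. It assumes the look-ahead split suggested elsewhere in this thread, a made-up sample header block, and a hypothetical @pairs array; the PMilter call is only indicated in a comment, using the method name as written above.

```perl
use strict;
use warnings;

# Hypothetical sample input: three headers, the second one folded.
my $fixedmsgheader = join "\n",
    'Subject: Hi there',
    'Content-Type: text/plain;',
    "\tformat=flowed;",
    "\tcharset=\"iso-8859-1\"",
    'Return-Path: <example@example.org>';

# Split only at newlines that are followed by a non-whitespace character,
# so continuation lines stay attached to their header.
my @fixedheaders = split /\n(?=\S)/, $fixedmsgheader;

my @pairs;
for my $header (@fixedheaders) {
    # Limit of 2: split at the first colon only, since values may contain colons.
    my ($name, $value) = split /:[ \t]*/, $header, 2;
    push @pairs, [ $name, $value ];
}

for my $pair (@pairs) {
    my ($name, $value) = @$pair;
    # This is where a call such as $ctx->Addheader($name, $value) would go.
    print "$name => $value\n";
}
```

The folding inside each value is left untouched: the newline and tab of a continuation line stay in $value exactly as they appeared in the input.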

Re: Splitting folded MIME headers into individual headers?
by roboticus (Chancellor) on Mar 02, 2015 at 20:48 UTC

    sebastiannielsen2:

    You're losing the first character of your headers because of that "\S". Maybe you should try something more like:

    @fixedheaders = split(/\n+/, $fixedmsgheader);

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      How would that work? A folded header line contains a single \n, and the next line then begins with whitespace. A newline (\n) followed by a non-whitespace character is the beginning of a new header.

      The whitespace that a folded line begins with will not be a \n. In fact, \n would not be permitted there, because a double \n (\n\n) marks the start of the body.

      Example of a non-folded header line, combined with a folded header line combined with a non-folded one:

      Subject: Hi you are beutiful
      Content-Type: text/plain;
          format=flowed;
          charset="iso-8859-1";
          reply-type=original
      Return-Path: <example@example.org>

      This should result in:

      $fixedheaders[0] = "Subject: Hi you are beutiful";
      $fixedheaders[1] = "Content-Type: text/plain;\tformat=flowed;\tcharset=\"iso-8859-1\";\treply-type=original";
      $fixedheaders[2] = "Return-Path: <example@example.org>";
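
To make the objection concrete, the following sketch runs the suggested /\n+/ split against the example headers. It assumes each continuation line starts with a newline followed by a tab; the @pieces name is only for illustration.

```perl
use strict;
use warnings;

# The example headers, with the folding written out explicitly.
my $fixedmsgheader = join "\n",
    'Subject: Hi you are beutiful',
    'Content-Type: text/plain;',
    "\tformat=flowed;",
    "\tcharset=\"iso-8859-1\";",
    "\treply-type=original",
    'Return-Path: <example@example.org>';

# Splitting on every run of newlines ignores folding entirely.
my @pieces = split /\n+/, $fixedmsgheader;

printf "%d pieces\n", scalar @pieces;
print "[$_]\n" for @pieces;
```

This yields six pieces rather than the three headers wanted, because each continuation line of Content-Type becomes an element of its own.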

        sebastiannielsen2:

        In that case, I'd use:

        @fixedheaders = split /\n\b/, $fixedmsgheader;

        Update: checked my work with a test program:

        $ cat splitest.pl
        #!/usr/bin/env perl

        use strict;
        use warnings;

        my $t = q{
        Subject: Hi there
        Content-Type: text/plain;
            format=flowed;
            charset="iso-1189-1";
        Return-Path: <example@example.org>
        };

        my @fields = split /\n\b/, $t;

        print join("\n***\n", @fields), "\n";

        localadmins-MacBook-Pro-2:~ [mmason] $ perl splitest.pl

        ***
        Subject: Hi there
        ***
        Content-Type: text/plain;
            format=flowed;
            charset="iso-1189-1";
        ***
        Return-Path: <example@example.org>

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

        Oops, the result was a little bit wrong. It should be:

        $fixedheaders[0] = "Subject: Hi you are beutiful";
        $fixedheaders[1] = "Content-Type: text/plain;\n\tformat=flowed;\n\tcharset=\"iso-8859-1\";\n\treply-type=original";
        $fixedheaders[2] = "Return-Path: <example@example.org>";

        I.e., with newlines before the tabs.

Re: Splitting folded MIME headers into individual headers?
by hdb (Monsignor) on Mar 02, 2015 at 21:32 UTC

    What about good ol' line by line processing?

    use strict;
    use warnings;
    use Data::Dumper;

    my $t = q{
    Subject: Hi there
    Content-Type: text/plain;
        format=flowed;
        charset="iso-1189-1";
    Return-Path: <example@example.org>
    };

    my @fields;
    for ( split /\n/, $t ) {
        push @fields, $_ and next if /^\S+:/;
        $fields[-1] .= $1 if /\s*(.+)/;
    }
    print Dumper \@fields;
Re: Splitting folded MIME headers into individual headers?
by johngg (Canon) on Mar 03, 2015 at 10:20 UTC

    I tried with the following:

    @fixedheaders = split(/\n\S/, $fixedmsgheader);

    But that eats the first char in header lines.

    Have you tried using a look-ahead?

    @fixedheaders = split(/\n(?=\S)/, $fixedmsgheader);

    Not tested, but that should stop the regex consuming the first character.
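
    A quick way to check the look-ahead is a sketch like the one below. It assumes the example headers from earlier in the thread, with each continuation line starting with a newline followed by a tab; the $visible variable is only there to make the embedded newlines and tabs readable when printed.

```perl
use strict;
use warnings;

# Example headers from the thread, with the folding written out explicitly.
my $fixedmsgheader = join "\n",
    'Subject: Hi you are beutiful',
    'Content-Type: text/plain;',
    "\tformat=flowed;",
    "\tcharset=\"iso-8859-1\";",
    "\treply-type=original",
    'Return-Path: <example@example.org>';

# (?=\S) asserts that a non-whitespace character follows the newline
# without consuming it, so the first character of each header survives.
my @fixedheaders = split /\n(?=\S)/, $fixedmsgheader;

for my $i (0 .. $#fixedheaders) {
    # Show embedded newlines and tabs as escapes for readability.
    (my $visible = $fixedheaders[$i]) =~ s/\n/\\n/g;
    $visible =~ s/\t/\\t/g;
    print "$i: $visible\n";
}
```

    With this input the split gives three elements, and the folded Content-Type header keeps its embedded newlines and tabs. If the data uses CRLF line endings, a pattern such as /\r?\n(?=\S)/ may be a better starting point, though that is an assumption about the input rather than something stated in the question.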

    Cheers,

    JohnGG

Re: Splitting folded MIME headers into individual headers?
by project129 (Beadle) on Mar 03, 2015 at 10:26 UTC

    Hi there!

    I propose using a Perl regexp positive look-ahead:

    split /\n(?=\S)/, $fixedmsgheader;

    P.S.: Also please look at Email::MIME. The mail RFCs have a lot of hidden issues.

    good luck!
