I'm not sure this is responsive, but I'm ignoring every field except the one for which you gave examples, on the surmise that it's your problem area:
#!/usr/bin/perl -w
use 5.018;
use strict;
#1149716
=pod

I would like to extract a piece of data from one field that has multiple
fields in it. The original field is a long description that usually
contains a #F123456, #123456, #123-F123456, #123-123456, or #12AB-123456
in it. This data floats around from left to right and there should be
whitespace before the #. Also, the end of the data is either whitespace,
or the end of the field.

=cut
my @data = (
    "TRAY HINGED PLSTC 20 CAV #F32473",
    "BOX HSC,35-3/4X17-1/4 X 50-1/2 SIMULATOR TALL BOX",
    "PAD, FOAM, 24 X 24 X 1/4 #16193
112 SHEETS PER ROLL, ORDER IN FULL ROLLS",
    "PKG LIST,ASST ARM,RAD,300
#F37784",
    "PAD, TOP CAP RE17-30048 #F30121
CORRUGATED ASSEMBLY, 22-7/8 X 21-1/8 X 4-3/4",
    "foo bar #379460 best F11",
    "F1234 SIMULATION",
);

for my $data (@data) {
    # say "\t|$data|\n\n";
    chomp $data;
    # Turn each embedded newline (plus any padding around it) into a single
    # space, so a '#' that starts a continuation line keeps the whitespace
    # the regex requires in front of it.
    $data =~ s/\s*\n\s*/ /g;
    if ( $data =~ /(^.* #[A-Z]*\d+.*$)/m ) {
        say "\n\$data matches regex\n";
        $data =~ s/ +/ /g;    # clean up excess spaces
        say "$data \n";
    } else {
        say "\n\t The data, $data, does NOT MATCH\n";
    }
}
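The regular expression may be obscure, so here's an explanation (generated without the /m flag the script uses, which makes no difference here because the newlines are gone before the match is attempted):
C:perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/(^.* #[A-Z]*\d+.*$)/)->explain();"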
The regular expression:
(?-imsx:(^.* #[A-Z]*\d+.*$))
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:
# NB: I did NOT need the parens as there's no use of the capture
# My bad, but harmless except for shoving bits & bytes around
# when they didn't need to be disturbed.
----------------------------------------------------------------------
    ^                        the beginning of the string
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
     #                       ' #'
----------------------------------------------------------------------
    [A-Z]*                   any character of: 'A' to 'Z' (0 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
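If the generated table is more than you want to read, the same pattern can carry its own notes under the /x modifier. What follows is only a sketch of that idea: the variable name $has_part_ref is mine, not from the script, and I've dropped the unused capture parens and the /m flag. Note that under /x a bare space and a bare # stop being literal, which is why they appear as [ ] and \# here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as in the script, minus the unused capture group and the /m
# flag. Under /x, whitespace and '#' inside the pattern are special, so the
# literal space is bracketed and the literal hash is escaped.
my $has_part_ref = qr/
    ^ .*        # anything from the start of the string ...
    [ ] \#      # ... up to a literal space followed by a hash sign
    [A-Z]*      # optional run of capital letters (the F in F32473)
    \d+         # one or more digits
    .* $        # whatever follows, to the end of the string
/x;

for my $desc ( 'TRAY HINGED PLSTC 20 CAV #F32473', 'F1234 SIMULATION' ) {
    printf "%-35s %s\n", $desc,
        ( $desc =~ $has_part_ref ? 'matches' : 'does NOT match' );
}
```

The first sample matches and the second does not, the same verdicts the one-line form in the script gives for those two strings.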
And the script's output is thus:
C:1149716.pl
$data matches regex
TRAY HINGED PLSTC 20 CAV #F32473
The data, BOX HSC,35-3/4X17-1/4 X 50-1/2 SIMULATOR TALL BOX, does NOT MATCH
$data matches regex
PAD, FOAM, 24 X 24 X 1/4 #16193 112 SHEETS PER ROLL, ORDER IN FULL ROLLS
$data matches regex
PKG LIST,ASST ARM,RAD,300 #F37784
$data matches regex
PAD, TOP CAP RE17-30048 #F30121 CORRUGATED ASSEMBLY, 22-7/8 X 21-1/8 X 4-3/4
$data matches regex
foo bar #379460 best F11
The data, F1234 SIMULATION, does NOT MATCH
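The script above only reports whether a description contains one of those # tokens; the question asks to extract it. Here is a minimal sketch of one way to do that, assuming the five sample formats in the question are exhaustive (an optional prefix of digits and capitals ending in a dash, an optional F, then digits). The helper name extract_ref and that reading of the format are my assumptions, not something the question confirms.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the '#...' token from a description,
# or undef when the description has none.
sub extract_ref {
    my ($desc) = @_;
    $desc =~ s/\s+/ /g;    # fold newlines and runs of spaces into one space
    # (?<!\S) : start of field or whitespace before the '#'
    # (?!\S)  : whitespace or end of field after the last digit
    return $desc =~ /(?<!\S)(#(?:[0-9A-Z]+-)?F?\d+)(?!\S)/ ? $1 : undef;
}

my @samples = (
    "TRAY HINGED PLSTC 20 CAV #F32473",
    "PAD, FOAM, 24 X 24 X 1/4 #16193   \n112 SHEETS PER ROLL",
    "foo bar #12AB-123456 best F11",
    "F1234 SIMULATION",
);

for my $desc (@samples) {
    my $ref = extract_ref($desc);
    print defined $ref ? "$ref\n" : "(no reference)\n";
}
```

The two lookarounds stand in for "whitespace before the #" and "whitespace or end of field after it" without swallowing that whitespace into the capture. If the real data has other prefixes or lower-case letters, the character classes would need widening.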
HTH. You'll often get better answers if you trim your code to the few (<20) lines that demonstrate only the problem you want addressed. I see you also want advice on the code you supplied, but I don't have time to build the jumbled CSV I'd need to assess its efficiency and/or clarity.