$str = '4 Figure 2. Figure 3. 5';
$str = qr{$str};
$str =~ m|^((((?![^\s]+<\/w:t>).)+).+?(<\/w:t>))|s;
$new_str = $1;
$new_str =~ s|(\w(((?!).)+)).+$|$1|s;
print $new_str;

output:
=======
4 Figure 2.