http://www.perlmonks.org?node_id=961986

ZWcarp has asked for the wisdom of the Perl Monks concerning the following question:

Say I have a file with a header, and I want to read it into a Perl script and work out automatically, by pattern matching, which column is which. Is there a trick in Perl that returns the column number without reading the header into an array, splitting it by field, and then looping through the array with a for loop?

Here's an example of the input data at the bottom. The columns will always be in a different order, so I want the script to match a column header and return the number of the column it matched, so that I can store it in a variable and use it later like this: print if $array[$variable] > 15 (just an example).

Does anyone know the most efficient, quick way to do something like this, or is the only way to split the header into an array and loop through it for each variable you want to match and assign a value? Thank you for your time!

Non-Syn Splice dbSNP N (Var Depth) N Total Depth N Freq + T (Var Depth)
1 0 0 0 34 0 11
1 0 0 0 54 0 14
1 0 0 0 42 0 11

Replies are listed 'Best First'.
Re: Shortcut to identify column numbers in a data file based on header
by tobyink (Canon) on Mar 27, 2012 at 19:53 UTC

    Looking for something like this?

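    use 5.010;

    # Read the header line and split it into column names (tab-separated).
    my @headers = do { chomp($_ = <DATA>); split /\t/ };

    while (<DATA>) {
        chomp;
        # Hash slice: map each header name to this row's value in that column.
        my %F;
        @F{@headers} = split /\t/;
        say $F{'N Total Depth'};
    }

    __DATA__
    Non-Syn Splice dbSNP N (Var Depth) N Total Depth N Freq + T (Var Depth)
    1 0 0 0 34 0 11
    1 0 0 0 54 0 14
    1 0 0 0 42 0 11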
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: Shortcut to identify column numbers in a data file based on header
by jandrew (Chaplain) on Mar 27, 2012 at 18:25 UTC

    It's not clear how you will receive the file or parse it, but once you have managed that task you can use a hashref as a look-up to retrieve values from the table. The only caveat is that the header look-up must be built individually for each file.
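
The hashref look-up described above could be sketched roughly as follows. This is only an illustration, not jandrew's own code: the helper names `header_index` and `columns_matching` are made up for the example, the input is assumed to be tab-separated, and the sample header uses just the first five columns of the question's data, with the column boundaries guessed.

```perl
use strict;
use warnings;

# Hypothetical helper: build a { column name => column index } hashref
# from one tab-separated header line. Must be rebuilt for each file.
sub header_index {
    my ($header_line) = @_;
    chomp $header_line;
    my @names = split /\t/, $header_line;
    my %index_of;
    @index_of{@names} = 0 .. $#names;    # hash slice: name => position
    return \%index_of;
}

# Hypothetical helper: return the indices (ascending) of every column
# whose name matches the given pattern.
sub columns_matching {
    my ($index_of, $pattern) = @_;
    return sort { $a <=> $b }
           map  { $index_of->{$_} }
           grep { /$pattern/ } keys %$index_of;
}

# Assumed sample input: first five columns of the question's data,
# joined with tabs for the sake of the example.
my $index_of = header_index(
    "Non-Syn\tSplice\tdbSNP\tN (Var Depth)\tN Total Depth\n"
);
my @row = split /\t/, "1\t0\t0\t0\t34";

# Exact look-up by header name, then ordinary array indexing.
my $depth_col = $index_of->{'N Total Depth'};
print "N Total Depth is column $depth_col\n";
print "row passes the depth filter\n" if $row[$depth_col] > 15;

# Pattern-based look-up: every column with "Depth" in its name.
my @depth_cols = columns_matching($index_of, qr/Depth/);
print "Depth columns: @depth_cols\n";
```

The header still gets split once per file, but after that each column is found by name in a single hash look-up, so there is no need to loop over the header array for every variable you want to set.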