Consolidating biological data into occurance of each biological unit per sample in table

ejohnston7 has asked for the wisdom of the Perl Monks concerning the following question:

The first one who helps me with this to the point where I can get it to work on my dataset can get a $15 paypal payment. Or you can have me donate it to PerlMonks, or a charity, or you could let me keep the money. Either way, this is very important to me and a deadline is coming up, so I am throwing that offer out there. This is my first time posting here so please bear with me. I am new to programming and have been successfully manipulating my data in perl until now. My data is in the following format:

occurence1    A    a__bear;c__black
occurence2    B    a__wolf;c__grey
occurence3    A    a__wolf;c__white
occurence4    A    a__bear;c__
occurence5    C    a__wolf;c__grey
occurence6    C    a__bear;c__brown
occurence7    A    a__wolf;c__
occurence8    B    a__wolf;c__
occurence9    C    a__bear;c__black
occurence10    C    a__wolf;c__
occurence11    A    a__wolf;c__red
occurence12    B    a__wolf;c__grey
occurence13    C    a__wolf;c__grey
occurence14    C    a__wolf;c__grey
occurence15    B    a__bear;c__brown
occurence16    C    a__bear;c__brown
occurence17    A    a__bear;c__
occurence18    A    a__bear;c__brown
occurence19    C    a__wolf;c__white
occurence20    B    a__wolf;c__grey
occurence21    B    a__bear;c__
occurence22    B    a__wolf;c__grey
occurence23    A    a__wolf;c__grey
occurence24    A    a__bear;c__brown
occurence25    C    a__bear;c__brown
occurence26    A    a__bear;c__brown
occurence27    C    a__bear;c__
occurence28    C    a__bear;c__brown
occurence29    B    a__wolf;c__red
occurence30    B    a__wolf;c__grey
[download]

Data is tab delimited. In the real dataset, instead of 30 occurrences (data points/lines) , there are ~5 million. There are also A LOT more animals in my data than just wolves and bears (I just included those two for simplification), so it would be unhelpful to create a script that contained if statements that specifically use words like 'bear' and 'wolf', because then we would need to make if statements for the hundreds of possible animals that can be spotted. Similarly, there are hundreds of possible colors for each animal. However, if you show me how to get to my desired end with this data then I'm sure I can apply it to my real data set. The data contained within the first column is not important, it just means that each row is a unique data point, that information doesn't need to exist in my desired output file. The numbers in the second column represent the sample ID. The third column is (you guessed it!) the animal ID. 'a__' is a broad observation of the animal and 'c__' is more specific information about the observed animal. For example, for the first data point, I observed a black bear in sample A. Note that for some data points, the color of the animal was not obtained. I am trying to generate a consolidated table that gives the number of occurrences of each animal per sample. I am imagining a table where the column headers are Sample IDs A, B, C, etc. and the rows represent the animal. The table will be filled with values (integers) that represent the number of times the animal occurred in that specific sample. It should look something like this....

    A    B    C
wolf    4    7    5
bear    6    2    6
[download]

I also want a table that gives similar information, but the rows represent the possible colors of the animal, like this

    A    B    C
black    1    0    1
grey    1    5    3
white    1    0    1
brown    3    1    4
red    1    1    0
[download]

Note that for the second table, if an occurrence (data point) did not contain a color, than that data should not be considered. Notice how the sum of all of the points in the second table is less than the sum of all data points in the second table. Comments throughout the script that let me know whats being done at each step would be super helpful, so I can alter it in the future for related data manipulations. Thanks in advance to anyone who tries to help me with this. I have been reading through forums for many hours trying to get answers.

Back to Seekers of Perl Wisdom