thechartist's scratchpad, by thechartist (Monk)
on Dec 29, 2017 at 04:57 UTC
101 Perl PDL Exercises for Data Analysis (March 2019 with PDL 2.019)
Long before "data science" became a fashionable topic in computing, Perl hackers were cleaning and analyzing data; they have been doing so since Perl was written. The following tutorial provides Perl solutions to the problems posed in "101 NumPy Exercises for Data Analysis".
My purpose is to demonstrate that Perl not only has the tools needed for common data analysis tasks, but is also likely to give you better performance with minor effort.
The philosophy of Perl has always been "There is more than one way to do it." Data analysis is no exception. While PDL has excellent functionality "out of the box", you might find it more effective to use individual CPAN modules to solve particular problems.
One other project to keep an eye on is RPerl -- a restricted subset of Perl that compiles to C++. It promises to give the best of both worlds: rapid prototyping to minimize developer time, with efficient code generation to minimize computational resources. At the time of this writing, RPerl appears to work only on Ubuntu Linux.
One current area of weakness is machine learning. This is due less to any flaw in Perl than to accidents of history. There are bindings to some modern ML libraries (such as MXNet), but the Perl community currently lacks extensive experience with them. Hopefully this tutorial will start to change that.
This document assumes you know some basic programming -- loops, conditionals, variables, etc. Perl syntax is similar to that of any C-derived language. Perl has a few fundamental data types: scalars, arrays, and hashes.
There are others (typeglobs and references), but they will not be needed for the exercises that follow.
As always in Perl, there is more than one way to do anything. With PDL, one can simply invoke the Perl interpreter at the command line (as for any other Perl script), or use a REPL (Read, Evaluate, Print, Loop) interface, such as the perldl or pdl2 shell that ships with PDL, for interactive analysis. These exercises show each PDL one-liner as entered at the command shell, but the code between the quotation marks should also work at the REPL. On Unix-like shells, wrap the code in single quotes rather than double quotes so that the shell does not expand variables such as $arr.
1. Q. Import PDL and print the version.
2. Q. Create a 1D array of numbers from 0 to 9.
3. Q. Create a 3×3 numpy array of all True’s
4. Q. Extract all odd numbers from arr = [0,1,2,3,4,5,6,7,8,9].
5. Q. Replace all odd numbers in arr (from question 4) with -1.
Answer: perl -MPDL -e "$arr = sequence(10); $odd = $arr->where($arr % 2 == 1); $odd .= -1 ; print $arr;"
6. Q. Replace all odd numbers in arr with -1 without changing arr.
Answer: perl -MPDL -e "$arr = sequence(10); $out = $arr->copy; $odd = $out->where($out % 2 == 1); $odd .= -1; print $arr, $out;"
Note: in PDL, '.=' is overloaded as an assignment operator that writes values into an existing PDL variable. In ordinary Perl it is used for string concatenation.
7. Q. Convert a 1D array to a 2D array with 2 rows.
8. Q. Stack arrays a and b vertically
9. Q. Stack the arrays a and b horizontally.