Howdy, partner! Name's Apple Fritter, pleasure to meet y'all! I use Perl, but I don't know that much about it (yet). I'm trying to change that, so I frequent the Monastery, reading others' answers and code to learn, and providing my own answers and code to hone my skills.
If I come across useful advice, tips, modules, code snippets, articles etc., I usually add it to my home node (which you are reading right now) for future reference. Maybe you'll find it useful, too!
Not affiliated with Tom Owad's applefritter.com.
Note: I'm not active on Perlmonks anymore. I may still update my home node when I come across items worth adding.
For new users:
Introductions to the Monastery:
- PerlMonks for the Absolute Beginner
- The Perl Monks Guide to the Monastery
- PerlMonks FAQ
- New Monks
- Spirit of the Monastery:
☞ Spirit of the Monastery:
Perlmonks relies upon a spirit of fellowship which places the responsibility for brotherly conduct on the individual Monk.
Highly opinionated and informed contributions are encouraged, but never at the expense of mutual respect among Monks, adherence to the agreed upon rules of the Monastery, or the basic joy of perl.
Protection of these vital elements serves to eliminate adverse conduct from the Monastery.
Such actions as taunting of other Monks, aggression in posts, belligerent intimidation, intentional disrespect, or other self-aggrandizing inconsiderate behavior are contrary to the Spirit of the Monastery and must be avoided by all Monks.
☜
On civility/kindness:
- Larry Wall, keynote address at YAPC::Europe 2015: Get Ready to Party:
You don't always have to agree with your companions on the road, but it certainly helps to be friendly if you disagree.
- Don't bite the newbies
- perlpolicy:
Always be civil. [...] Civility is simple: stick to the facts while avoiding demeaning remarks and sarcasm. It is not enough to be factual. You must also be civil. Responding in kind to incivility is not acceptable.
While civility is required, kindness is encouraged; if you have any doubt about whether you are being civil, simply ask yourself, "Am I being kind?" and aspire to that.
- "The first rule of ethics is "don't be a dick", from which all other rules logically follow."
- Re^7: RAM: It isn't free . . . (the aggrieved troll their troll)
- Re^4: When do we change our replies? (approving):
My main advice to everybody related to this is for one to only respond to questions where one has something helpful to offer in response and for which one is particularly suited to answer. [...]
If a question annoys you, then minimize your annoyance by immediately moving on to something more enjoyable for you. Please try to refrain from sharing your annoyance so that we all get to suffer from it. Most of you are probably even clever enough to figure out a lot of the questions that are likely to end up annoying you so you can avoid even clicking through to them in the first place.
If a question annoys everybody, then everybody will ignore it. The history of the internet says that's one of the best ways to end something. If the question doesn't annoy everybody, then we have a case of somebody asking a question and some others willingly answering the question via a web site. That sounds a lot like "success".
Further reading: Useful homenodes
Introductions to Perl and resources for learning Perl:
Introductions, first steps and general information:
Tutorials:
Best practices and other information:
- On strict and warnings:
- Use strict and warnings
- The strictures, according to Seuss
- why am I getting odd behavior on DESTROY - a great example of how and why use strict; is important
- Using modules and not reinventing the wheel:
- Yes, even you can use CPAN - installing CPAN modules without root access
- Top Seven (Bad) Reasons Not To Use Modules
FAQs:
- The perlfaq manpage
- perlfaq1 - General Questions about Perl
- perlfaq2 - Obtaining and Learning about Perl
- perlfaq3 - Programming Tools
- perlfaq4 - Data Manipulation
- perlfaq5 - Files and Formats
- perlfaq6 - Regular Expressions
- perlfaq7 - General Perl Language Issues
- perlfaq8 - System Interaction
- perlfaq9 - Web, Email and Networking
- Just the FAQs (from The Perl Journal, by Dominus)
- Categorized Questions and Answers
Books:
- Learning Perl, aka "The Llama" (7th edition, October 2016)
- Beginning Perl (by Ovid; August 2012)
- Intermediate Perl, aka "The Alpaca" (July 2012)
- Mastering Perl (January 2014)
- Programming Perl, aka "The Camel"; a must-have (February 2012)
- Modern Perl (4th Edition, December 2015; free PDF)
- Modern Perl (April 2014; PDF here)
- Advanced Perl Programming (June 2005)
- Perl Best Practices (July 2005)
- Effective Perl Programming, aka "The Shiny Ball" (April 2010)
- Perl Cookbook, aka "The Big-Horn Sheep" (August 2003)
- Perl Hacks (May 2006)
- Perls of Wisdom (December 2004)
- Introducing Regular Expressions (July 2012)
- Mastering Regular Expressions (August 2006)
- Mastering Algorithms with Perl, aka "The Timber Wolf" (August 1999)
- Higher-Order Perl (March 2005; PDF here)
- Perl Testing (July 2006)
- Perl Medic (March 2004)
- Professional Perl Programming (co-authored by dada; February 2001)
- Object-Oriented Perl (January 2000)
- Unreleased: Modern Object Oriented Programming in Perl
- Parsing with Perl 6 Regexes and Grammars (by moritz; 2017)
For non-IT folks, e.g. biologists:
- UNIX and Perl to the Rescue!: A Field Guide for the Life Sciences (and Other Data-rich Pursuits) (August 2012)
There are also many books dedicated to specific topics such as Perl/Tk, DBI, Perl and ☞XML, ☞CGI programming with Perl, and much, much more; see Perl Reference Materials: Books for an (outdated) list.
Other lists and resources:
- books.perl.org
- O'Reilly's catalog of Perl books
- Unwritten Perl Books
- Day 12 – The Year of Perl 6 Books
Reviews, opinions etc.:
- Book Reviews
- xdg on various Perl books: Re^5: Why so much hate?
Asking questions (on Perlmonks and elsewhere):
How to ask questions (based on ww, Re: Replace key pair value from one to other file):
- Above all, welcome to the Monastery!
- Read the instructions ("Asking questions effectively", "Formatting your write-up", below).
- Read the documentation.
- Show effort. Write some code; at the very least, try. Help is free; doing your job for you is not.
- Describe what you want to accomplish. Be precise.
- Show us your code.
- Describe failures, expected results, and actual results.
- If applicable, show us verbatim (!) error messages/warnings.
- If applicable, give us some sample data.
- Give us the larger picture: tell us what you want to achieve, not just how you decided to go about achieving it. There may be better ways of doing it that you haven't contemplated.
- Remember, we're here to help, but we need your help to help you.
Asking questions effectively:
- On asking for help
- How do I post a question effectively?
- How do I compose an effective node title?
- I know what I mean. Why don't you?
- How (Not) To Ask A Question
- Clean your room
- XY Problem
- How to RTFM
- The SSCCE: Short, Self Contained, Correct (Compilable) Example
- A few guidelines for good code, from Laurent_R: Re: text files are printed after the end of second module
- Where should I post X? / PerlMonks Sections
- Is PM a good place to get answers for homework?
- How to ask better questions using Test::More and sample data
Formatting your write-up:
Other places to get help:
Other places to learn about Perl:
- Perl Weekly (weekly email newsletter)
N.B. when crossposting to several sites, it is considered polite to inform readers of this and provide links to avoid unnecessary/duplicated effort.
For established users:
Combinatorics:
- CPAN: Algorithm::Combinatorics
- CPAN: Math::Combinatorics - pure Perl, slower
- Faster alternative to Math::Combinatorics
Daemons:
Databases:
- CPAN: DBI, as well as DBD::*
- Databases made easy
- DBI recipes
- Database Programming
Also see ☞Unicode flags for database drivers further down.
Data munging:
- CPAN: Data::Compare - compare nested data structures
- CPAN: Data::Walker - interactively navigate Perl data structures
- CPAN: Data::Diver
- CPAN: Data::Rmap - recursive map
- CPAN: Data::Table
Data structures:
- perldsc
- Hashes:
- Remember the Perl motto: when in doubt, use a hash! — Athanasius, Re^3: Need help in extracting timestamp from the line in a file
- CPAN: Tie::IxHash - ordered hashes
- CORE: Hash::Util - locked hashes, among other things
- Peter Kankowski: Hash functions: An empirical comparison
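A quick sketch of the "locked hashes" that Hash::Util provides, using only the core module. The hash contents and the misspelled key are made up for illustration.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);

# Hypothetical option hash; the keys are illustrative only.
my %opt = (verbose => 0, output => 'out.txt');

# After lock_keys, existing keys can still be modified,
# but adding a new key is a fatal error.
lock_keys(%opt);
$opt{verbose} = 1;

# A typo in a key name now dies instead of silently creating a new entry.
my $ok    = eval { $opt{verbsoe} = 1; 1 };
my $error = $@;
print "caught: $error" unless $ok;
```

This is handy for catching misspelled keys, much like use strict; does for variable names.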
Date/time manipulation:
Articles:
Parsing:
- CPAN: Time::Piece (core)
- CPAN: Date::Extract
- CPAN: Date::Manip
- CPAN: Date::Parse
- CPAN: DateTime::Format::Natural
- CPAN: HTTP::Date
- CPAN: Time::ParseDate
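For simple, fixed formats the core module Time::Piece from the list above is often enough. A minimal sketch; the timestamp and the format strings are made-up sample values.

```perl
use strict;
use warnings;
use Time::Piece;

# Parse a timestamp in a known, fixed format (sample value, not real data).
my $t = Time::Piece->strptime('2015-09-02 14:30:00', '%Y-%m-%d %H:%M:%S');

my $year        = $t->year;                   # 2015
my $month       = $t->mon;                    # 9
my $reformatted = $t->strftime('%d.%m.%Y');   # 02.09.2015

print "$reformatted ($year, month $month)\n";
```

For free-form or natural-language dates, one of the CPAN parsers listed above is probably the better choice.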
Time zone conversion:
Debugging:
- brian's Guide to Solving Any Perl Problem
- Basic debugging checklist
- Unbelievably Obvious Debugging Tip
- Use strict warnings and diagnostics or die
- Debugging and Optimization
- CPAN: Data::Dumper
- Set $Data::Dumper::Indent = 0; for deeply-nested data structures.
- Set $Data::Dumper::Sortkeys = 1; to sort hash keys. (Recommended!)
- CPAN: Data::Dump::Streamer - alternative to Data::Dumper
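The two Data::Dumper settings mentioned above can be tried out like this. It's only a sketch; the data structure is invented for the example.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sort hash keys so that dumps are stable and easy to compare.
$Data::Dumper::Sortkeys = 1;
# Indent = 0 puts the whole structure on a single line.
$Data::Dumper::Indent = 0;

# Hypothetical sample data.
my %config = (name => 'fritter', langs => ['perl'], level => 9);

my $dump = Dumper(\%config);
print $dump, "\n";
```

With Sortkeys set, two dumps of equal hashes come out identical, which makes diffing debug output much less painful.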
Design patterns:
- Perl Design Patterns wiki
- Dominus: Design patterns of 1972
- Ralph Johnson: reply to the above (not available anymore, not on archive.org)
- Dominus: Ralph Johnson on design patterns (reply to Johnson)
- Design Patterns Considered Harmful
- Dominus: "Design Patterns" Aren't
- Peter Norvig: Design Patterns in Dynamic Languages
Distros, packages etc. (e.g. for Windows users):
- Citrus Perl (Linux/*nix, OS X, Win32)
- Strawberry Perl (Win32)
- DWIM Perl (Linux, Win32)
- Perlbrew (Linux/*nix)
- Super-concise Perlbrew HOWTO: Re^2: perlbrew and cpan (perlbrewintro)
- ActiveState ActivePerl (Linux/*nix, OS X, Win32) - less recommended
- Modify the system Perl, or install your own Perl?
- Install your own Perl: RFC: (Do Not) Modify the System Perl
- Modify the system Perl: Re: RFC: (Do Not) Modify the System Perl
Email:
Addresses:
- Email address validation: please stop
- Email address validation: an addendum
- I Knew How To Validate An Email Address Until I Read The RFC
- CPAN: Mail::RFC822::Address
- CPAN: RFC::RFC822::Address
- CPAN: Email::Valid
- CPAN: Mail::Address
- CPAN: Email::IsEmail
- Is this email address valid?
- input - E-mail address - how to check string ?
Errors / Warnings:
- Making die print stack traces:
- Forcing stack trace?
- Using only core modules (Carp):
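BEGIN { *CORE::GLOBAL::die = sub { require Carp; Carp::confess(@_) } } # BEGIN so that code compiled afterwards sees the override; @_ passes the original message along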
- More elegant: Carp::Always. Can be used on the command line: perl -MCarp::Always script.pl
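Here's a small, self-contained sketch of the core-only approach, to show what the resulting trace looks like. The subs inner and outer are hypothetical names, and the trace is captured with eval only so that the script can carry on and print it.

```perl
use strict;
use warnings;

# Override die globally. This has to happen at compile time (BEGIN),
# or code compiled later will still use the built-in die.
BEGIN {
    no warnings 'once';
    *CORE::GLOBAL::die = sub { require Carp; Carp::confess(@_) };
}

sub inner { die "something broke" }   # hypothetical failing sub
sub outer { inner() }

# Capture the exception; $@ now holds the message plus a stack trace.
my $ok    = eval { outer(); 1 };
my $trace = $@;
print $trace;
```

The trace lists every sub on the call stack, so you can see how the failing code was reached.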
eval / Exceptions:
- eval
- What does eval actually do?
- Do the Monks recommend Try::Tiny for eval work?
- Try::Tiny vs. TryCatch: assigning a value to a variable inside of the try-block
- Try::Tiny - check the BACKGROUND section of the documentation
- Devel::EvalError
External commands:
- perlop: qx//, ``: perlop#Quote and Quote like Operators, #Quote Like Operators
- Open a process for both reading/writing (STDIN/STDOUT): IPC::Open2
- Open a process for both reading/writing/error handling (STDIN/STDOUT/STDERR): IPC::Open3
- Quoting strings for the shell: String::ShellQuote
- Unix shell versus Perl
File input/output:
Input:
- CPAN: File::ReadBackwards
- CSV files:
- How to read a CSV file using Perl? - executive summary: use Text::CSV.
- CPAN: Text::CSV
- Don't attempt your own CSV file handling.
- set the binary and auto_diag attributes when creating a new Text::CSV object.
- use ->getline() rather than reading lines yourself and calling ->parse(), or your script will break on embedded newlines.
- CPAN: Text::CSV_PP - also handles UTF-8 correctly; see Re^5: Speeds vs functionality
Output:
- CPAN: IO::Tee - write to many files/handles at once.
File names:
- CORE: File::Basename - split filenames into path and (actual) filename.
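A minimal sketch of what File::Basename does. The path is a made-up example and assumes Unix-style path separators.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname fileparse);

# Hypothetical path, Unix-style.
my $path = '/var/log/app/server.log';

my $file = basename($path);   # server.log
my $dir  = dirname($path);    # /var/log/app

# fileparse can also split off a suffix matching a pattern.
my ($name, $dirpart, $suffix) = fileparse($path, qr/\.[^.]*/);

print "$dir | $file | $name | $suffix\n";
```

Note that fileparse keeps the trailing slash on the directory part, while dirname does not.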
Graphs (the mathematical kind):
General information/further reading:
- See the ☞book "Mastering Algorithms with Perl", chapter 8, pp. 273-352.
- Wikipedia: Transitive reduction (not implemented in Graph)
Useful modules:
- CPAN: Graph
- Note that this module does not handle arbitrary scalars as nodes; strange things will happen if you try to add e.g. references to nested data structures. This is arguably a bug; to work around it, keep your data in a separate hash, indexed by the names of the vertices in your graph.
- CPAN: Graph::ReadWrite - serialize graphs.
- Note that Graph::Reader::Dot is broken; it ignores isolated vertices, so use e.g. XML instead if you need to read graphs again. Graph::Writer::Dot is NOT broken and can be used to generate Graphviz files for use with external tools (e.g. Graphviz itself, Tulip etc.)
External tools:
Graphs, charts and plots:
- There is no great plotting module for Perl. You may want to consider shelling out to Python and using matplotlib.
Modules:
- CPAN: GD::Graph - decent for bar/line charts, pie charts suck, PNG/GIF output
- CPAN: SVG::TT::Graph - decent pie charts, limited bar charts, SVG output
HTML:
Parsing:
General tips:
- Don't use regular expressions. You will get it wrong; use a HTML parsing module.
- You may be able to use ☞XML parsing if you're dealing with XHTML.
Articles:
- A. Sinan Unur: Parsing HTML with Perl (February 2014)
- Kendrew Lau: Analyzing HTML with Perl (January 2006)
- Using the HTML::Parser module (undated)
Modules:
- CPAN: HTML::TokeParser::Simple
- CPAN: HTML::TableExtract - for parsing tabulated data
- CPAN: HTML::PullParser
- CPAN: HTML::Parser
- CPAN: HTML::TreeBuilder
- CPAN: HTML::TreeBuilder::XPath
- CPAN: Mojo::DOM
List processing:
- CORE: List::Util - reduce, any/all, first, sum/product, min/max, pairgrep, pairmap etc.
- CPAN: List::MoreUtils - uniq, zip, etc.
- CPAN: List::AllUtils (the previous two in one convenient module)
- Missing from List::Util / List::MoreUtils / List::AllUtils: pairwise_distinct (workaround: uniq(@list) == @list).
- CPAN: List::Compare - union, intersection, differences, symmetric difference etc.
- CPAN: List::Compare::Functional - non-object-oriented version
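A short sketch of a few List::Util functions, plus the pairwise_distinct workaround mentioned above wrapped in a sub. It assumes a List::Util recent enough to export uniq (1.45 or later); the sub name and the sample numbers are just for illustration.

```perl
use strict;
use warnings;
use List::Util qw(sum max first reduce uniq);

my @n = (3, 1, 4, 1, 5, 9, 2, 6);

my $total     = sum @n;                   # 31
my $largest   = max @n;                   # 9
my $first_big = first { $_ > 4 } @n;      # 5
my $product   = reduce { $a * $b } @n;    # 6480

# Workaround for the missing pairwise_distinct: in scalar context,
# uniq returns the number of distinct elements.
sub pairwise_distinct {
    my @list = @_;
    return uniq(@list) == @list;
}

print "total=$total largest=$largest first_big=$first_big product=$product\n";
print "distinct\n" if pairwise_distinct(1, 2, 3);
```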
Logic:
Logging:
- CPAN: Log::Log4Perl
Math:
Basic arithmetic:
Large numbers:
- Core: Math::BigInt, Math::BigFloat, Math::BigRat
- Also use Math::BigFloat lib => 'GMP'; if you'd like to use the libgmp backend.
- Use bignum and bigrat for transparent upgrading of constants.
- Caveat: Using bigrat breaks Math::BigRat's ->as_float() method. See core bug 127802 and CPAN bug 113588. Workaround: my $floatval = do { no bigrat; eval $ratval };
- CPAN: Math::MPFR
- CPAN: Math::BigApprox - if approximate results are enough
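To see why the big number modules matter, here's a minimal sketch with core Math::BigInt. The numbers are arbitrary examples.

```perl
use strict;
use warnings;
use Math::BigInt;

# A native number becomes a floating point approximation at this size.
my $native = 2 ** 100;

# Math::BigInt keeps every digit.
my $exact = Math::BigInt->new(2)->bpow(100);   # 1267650600228229401496703205376

# Arithmetic operators are overloaded and return new Math::BigInt objects.
my $next = $exact + 1;

print "native: $native\n";
print "exact:  $exact\n";
print "next:   $next\n";
```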
Marshalling/serialization:
- Use JSON or a similar format.
- CPAN: JSON::XS
- CPAN: Cpanel::JSON::XS (also supports older Perls all the way back to 5.6)
- Serialization to binary:
- CPAN: Sereal
- BSON - "Binary JSON, [...] a binary-encoded serialization of JSON-like documents"
- CPAN: BSON
- Do not use Storable, it's evil. See e.g. Storable- long integer size.
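A round trip through JSON could look like the sketch below. Note the assumption: it uses JSON::PP, the pure-Perl module that ships with Perl, instead of the XS modules listed above, so that it runs without installing anything. For this simple usage the interface is the same. The data is invented.

```perl
use strict;
use warnings;
use JSON::PP;

# canonical sorts hash keys, so the output is reproducible;
# utf8 makes encode/decode work on UTF-8 encoded octets.
my $json = JSON::PP->new->utf8->canonical;

# Hypothetical sample data.
my $data = { name => 'fritter', langs => [ 'perl', 'raku' ] };

my $text = $json->encode($data);   # {"langs":["perl","raku"],"name":"fritter"}
my $copy = $json->decode($text);

print "$text\n";
print "second language: $copy->{langs}[1]\n";
```

Swapping in JSON::XS or Cpanel::JSON::XS should mostly be a matter of changing the class name.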
MediaWiki:
Bots/API:
- CPAN: MediaWiki::API - low-level access to the MediaWiki API
- CPAN: MediaWiki::Bot - high-level bot functionality, uses MediaWiki::API. Unmaintained; logging in does not work.
- API docs on mediawiki.org
- Bot manual on mediawiki.org
OOP (object-oriented programming):
- perlootut
- perlobj
- Damian Conway's ten rules for when to use OO
- Often Overlooked OO Programming Guidelines
- tobyink: Method Privacy in Perl
- Inside-out objects:
- merlyn: Unix Review Column 63
- CPAN: Class::InsideOut
- CPAN: Object::InsideOut
- CPAN: MooX::InsideOut
- CPAN: MooseX::InsideOut
Operators:
- <> is shorthand for <ARGV>, which is just as magic. Corollary: *ARGV is magic as well.
- .. and ... (range/flip flop):
- Flipin good, or a total flop?
- The Scalar Range Operator
- flip-flop interpolation
- Resetting a flip-flop operator
- Multi-state flip-flops (flip-flop-flaps?): Re^8: Multi-stage flip-flop? ( till() - proof of concept)
- Perl's secret operators: perlsecret
- More secret operators: new "!"-based secret operators
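For readers who haven't met the flip-flop yet, here's a minimal sketch of the scalar range operator picking out a block of lines. The markers and the sample lines are made up.

```perl
use strict;
use warnings;

# Sample input; BEGIN and END are arbitrary marker lines.
my @lines = ('header', 'BEGIN', 'a', 'b', 'END', 'footer');

my @block;
for (@lines) {
    # In scalar context, .. becomes true when the left operand matches
    # and stays true up to and including the line where the right one matches.
    push @block, $_ if /^BEGIN$/ .. /^END$/;
}

print join(',', @block), "\n";   # BEGIN,a,b,END
```

The linked nodes go into the finer points, such as the difference between .. and ... and how to reset the operator's state.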
Optimization:
- Premature optimization is the root of all evil.
- Athanasius, Recamán's sequence and memory usage:
[O]ptimising an algorithm may actually consist in optimising its underlying data structures. Obvious? Yes, but still worth a reminder now and then.
- raven667, Re: Firefox 50.0 (lwn.net):
Efficiency gains should be targeted based on real world profiling and not based on review of code that "looks slow" as you will waste a ton of time lost in the details, chasing down non-existent performance problems, sometimes making things worse if you fight the compiler, and missing the big issues which are usually more fundamental to the design and data structure usage or locking in the hottest paths of the application.
- CPAN: Devel::NYTProf - powerful, fast, feature-rich Perl source code profiler
- Caching:
- The two hardest problems in computer science are cache invalidation and cache invalidation.
- CPAN: Memoize - transparently cache function results
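A minimal sketch of Memoize in action, using the classic (and deliberately naive) recursive Fibonacci function as a stand-in for an expensive computation.

```perl
use strict;
use warnings;
use Memoize;

# Naive recursion: exponential running time without caching.
sub fib {
    my $n = shift;
    return $n if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

# Replace fib with a caching wrapper; the recursive calls go through it too.
memoize('fib');

my $result = fib(50);   # 12586269025
print "$result\n";
```

Only memoize functions whose result depends on nothing but their arguments, or the cache will happily hand back stale results.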
Option processing:
- CPAN: Getopt::Long
- Note: by default, calling GetOptions consumes all options from @ARGV, not just the ones specified in the call to GetOptions; options it does not know are reported as errors. As a result you cannot simply call GetOptions more than once (unless you enable the pass_through configuration option); keep this in mind when you want to mix e.g. user-defined subroutines to handle options and storing option values in a hash.
- The Dynamic Duo --or-- Holy Getopt::Long, Pod::UsageMan!
- GetOpt Organization
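One way to mix a hash and a handler subroutine is to do both in a single call. A sketch, using GetOptionsFromArray so that it doesn't touch the real @ARGV; the option names and arguments are invented for the example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Stand-in for @ARGV (hypothetical command line).
my @args = ('--verbose', '--name', 'fritter',
            '--include', 'lib', '--include', 't', 'file.txt');

my %opt;        # options without an explicit destination end up here
my @includes;   # filled by the handler sub below

my $ok = GetOptionsFromArray(
    \@args,
    \%opt,
    'verbose!',
    'name=s',
    # An explicit destination (here a sub) overrides storing in the hash.
    'include=s' => sub { my ($name, $value) = @_; push @includes, $value },
);

print "name=$opt{name} includes=@includes rest=@args\n";
```

After the call, only the non-option arguments are left in the array.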
References:
- Cargill's quandary:
Any design problem can be solved by adding an additional level of indirection, except for too many levels of indirection.
- perlref
- References quick reference
- Mini-Tutorial: Dereferencing Syntax
Regular expressions, parsing and grammars:
HOWTOs, tutorials etc.:
- perlretut - tutorial
- perlrequick - quick start
- Using Look-ahead and Look-behind
Debugging regular expressions:
- use re 'debug';
In-depth documentation and references:
- perlre
- perlrebackslash - backslash sequences and escapes
- perlrecharclass - character classes
- perlreref - reference
- perlfaq6
- Pattern Matching, Regular Expressions, and Parsing - regular expression tutorials on (mostly) PerlMonks
- Internals:
- perlreapi - plugin interface (API)
- perlreguts - "guts" of the regular expression engine
Books:
- Introducing Regular Expressions (July 2012)
- Mastering Regular Expressions (August 2006)
Useful CPAN modules:
- CPAN: Regexp::Common
- CPAN: Regexp::Grammars
- CPAN: Regexp::Optimizer
- CPAN: Regexp::Compare
- CPAN: Pegex (see Pegex::API for more information)
- CPAN: YAPE::Regex::Explain - explain regular expressions
Misc.:
Security:
Signals:
Sorting:
- Perl uses mergesort by default, with an O(n log(n)) worst case performance.
- Fine-grained control over sorting algorithms: the sort pragma
Idioms:
- Schwartzian Transform
- Wikipedia: Schwartzian transform
- Orcish Maneuver (note: use //= on Perl ≥5.10)
- Advanced Sorting - GRT - Guttman Rosler Transform
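To give an idea of what the first two idioms look like, here's a sketch that sorts words by length. length is only a stand-in for a sort key that is expensive to compute; the word list is made up.

```perl
use strict;
use warnings;

my @words = qw(pear fig banana kiwi apple);

# Schwartzian Transform: compute each key once (map), sort on the
# precomputed key, then throw the key away again (map).
my @by_length =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }
    map  { [ length($_), $_ ] }
    @words;

# Orcish Maneuver: cache the key inside the comparison, using //=.
my %key;
my @orcish = sort {
    ($key{$a} //= length $a) <=> ($key{$b} //= length $b)
        or $a cmp $b
} @words;

print "@by_length\n";   # fig kiwi pear apple banana
print "@orcish\n";      # fig kiwi pear apple banana
```

Both avoid recomputing the key on every comparison; which one reads better is a matter of taste.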
Articles, columns and essays:
- merlyn: Sorting (Unix Review Column 6, January 1996)
- A Fresh Look at Efficient Perl Sorting
HOWTOs
Useful CPAN modules:
- CPAN: Data::Table - for two-dimensional data, allows column-based sorting
- CPAN: Sort::Key
Statistics (the mathematical kind):
- CPAN: Statistics::Descriptive
- CPAN: PDL
- CPAN: PDL::Stats
- CPAN: Statistics::ANOVA - various types of parametric and non-parametric 1-way variance analysis
Temporary files:
- Using Temporary Files in Perl
- CORE: File::Temp
- CORE: IO::File - use ->new_tmpfile
- ☞ Perl Cookbook, 7.5: Creating Temporary Files, pp. 232-234
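A minimal sketch of File::Temp: create a temporary file, write to it, and read it back. The content is arbitrary.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# UNLINK => 1 asks File::Temp to remove the file when the program exits.
my ($fh, $filename) = tempfile(UNLINK => 1);

print {$fh} "scratch data\n";
close $fh or die "Cannot close $filename: $!";

# Read the file back to show that it is an ordinary file on disk.
open my $in, '<', $filename or die "Cannot open $filename: $!";
my $content = do { local $/; <$in> };
close $in;

print "read back: $content";
```

Letting the module pick the file name avoids the race conditions you invite by inventing names yourself.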
Text input/output:
Input:
- CPAN: IO::Prompter
- CPAN: IO::Prompt::Tiny - portable, simple-to-use
- CPAN: IO::Prompt::Hooked - based on IO::Prompt::Tiny, provides more options
Output:
- CPAN: Text::Table
- CPAN: Term::ANSIColor
- tee(1)ing STDOUT to a file:
- CPAN: PerlIO::Util; use e.g. *STDOUT->push_layer(tee => 'stdout.log');
Threads:
General:
Sharing data between threads:
UIs:
Unicode/UTF8:
HOWTOs, BCPs, tips and tricks:
- Keep in mind the difference between bytes, codepoints, and characters ("extended grapheme clusters"). Variable-length encodings (UTF8) complicate things. So do combining diacritics.
- Make STDOUT use UTF-8: binmode STDOUT, ':utf8'; (from perldiag).
- "Magic incantation" for defaulting to UTF8 when opening files, and also for STD*: use open IO => ':utf8', ':std';. Actually, :encoding(UTF-8) may be better than :utf8, see Re: A UTF8 round trip with MySQL.
- hippo, in Re: Matching/replacing a unicode character only works after decode():
The correct order of operations for working with encoded data (whether utf8 or any other encoding) is:
- Input
- Decode
- Operate
- Encode
- Output
If you don't decode your input you'll be comparing apples and elephants which is why your regex fails to match. However, if you do no operations on the data at all, then you can skip the middle three steps because your perl script in that case is just essentially a pipe between your input (eg. database) and your output (eg. web page).
- UTF-8 text files with Byte Order Mark
- Check whether Perl thinks your data is UTF8: $flag = utf8::is_utf8($string);
- Unicode flags for database drivers:
- DBD::MySQL: mysql_enable_utf8
- DBD::SQLite: sqlite_unicode (but beware that this will also affect BLOBs!)
- MySQL and UTF-8:
[...] MySQL offers a "charset" named UTF8. Guess what, it's not UTF8. It's actually a synonym for UTF8MB3, which is MySQL's bizarre internal "UTF8 except we only allow 3 bytes per character" rule. If you actually need UTF8 you must upgrade to a very new version and explicitly ask MySQL for "UTF8MB4".
Anybody who has used MySQL before can guess what happens if you try to insert actual Unicode data (say, an HTML-ised comment your PHP blogging framework wants to store) into one of these UTF8 columns. Afraid to incur your wrath with an error you probably haven't handled correctly, MySQL will quietly truncate the string, removing everything from the offending codepoint onwards. [...]
- Catching "Unicode non-character" warnings - with a good and exhaustive reply by Tom Christiansen (might be outdated)
- perlunicook -- cookbookish examples of handling Unicode in Perl (also here, by Tom Christiansen)
- JSON::XS has "a few notes on Unicode and Perl":
Since this often leads to confusion, here are a few very clear words on how Unicode works in Perl, modulo bugs.
- Perl strings can store characters with ordinal values > 255.
This enables you to store Unicode characters as single characters in a Perl string - very natural.
- Perl does not associate an encoding with your strings.
... until you force it to, e.g. when matching it against a regex, or printing the scalar to a file, in which case Perl either interprets your string as locale-encoded text, octets/binary, or as Unicode, depending on various settings. In no case is an encoding stored together with your data, it is use that decides encoding, not any magical meta data.
- The internal utf-8 flag has no meaning with regards to the encoding of your string.
Just ignore that flag unless you debug a Perl bug, a module written in XS or want to dive into the internals of perl. Otherwise it will only confuse you, as, despite the name, it says nothing about how your string is encoded. You can have Unicode strings with that flag set, with that flag clear, and you can have binary data with that flag set and that flag clear. Other possibilities exist, too.
If you didn't know about that flag, just the better, pretend it doesn't exist.
- A "Unicode String" is simply a string where each character can be validly interpreted as a Unicode code point.
If you have UTF-8 encoded data, it is no longer a Unicode string, but a Unicode string encoded in UTF-8, giving you a binary string.
- A string containing "high" (> 255) character values is not a UTF-8 string.
It's a fact. Learn to live with it.
- Markus Kuhn: UTF-8 and Unicode FAQ for Unix/Linux
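The input, decode, operate, encode, output order quoted from hippo above can be sketched with the core Encode module. The sample string is made up, and the octets are written out as escapes so that the example doesn't depend on the encoding of the script file.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Input: UTF-8 encoded octets, as they would arrive from a file or socket.
my $bytes = "na\xc3\xafve caf\xc3\xa9";   # 12 octets

# Decode: turn the octets into a string of characters.
my $text = decode('UTF-8', $bytes);        # 10 characters

# Operate: character-level operations now behave as expected.
my $upper = uc $text;

# Encode: back to octets, ready for output.
my $out = encode('UTF-8', $upper);

printf "%d octets in, %d characters, %d octets out\n",
    length $bytes, length $text, length $out;
```

Calling uc on the undecoded octets would leave the two accented letters alone, which is the apples and elephants problem from the quote.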
Win32-specific issues:
Useful CPAN modules:
- Test::utf8
- utf8::all
- Unicode::Normalize
- (core) Unicode::UCD - interface to the Unicode Character Database
- File::BOM
- File::BOM::Utils
- Text::Unidecode - transliterate Unicode characters to plain 7-bit ASCII
- Unicode::Collate - implementation of Unicode Technical Standard #10 (UTS #10) - Unicode Collation Algorithm (UCA)
- Unicode::Peek - peek at the internal representations of Unicode strings
Fonts:
Scripts and tools:
- CORE: encguess - guess character encodings of files
- Jim: script that counts the number of bytes, code points and graphemes in each UTF-8 encoded word, and also tallies codepoints by Unicode blocks: Re: length() miscounting UTF8 characters?
- Unicode::Tussle: Tom's Unicode Scripts So Life is Easier. uniprops and unichars are particularly useful.
- Unicode Character Finder
- decodeunicode (Unicode 5.0 only)
- ikegami: script to fix up (to the extent possible) UTF-8 encoded text that was wrongly decoded as Windows-1252 and encoded again as UTF8: Re^5: Why does Encode::Repair only correctly fix one of these two tandem characters?
Talks, articles, references, presentations and meditations:
- tchrist on Unicode.
- perlunitut
- why no default unicode?
- A UTF8 round trip with MySQL
- Tom Christiansen: Unicode Support Shootout: The Good, the Bad, & the (mostly) Ugly (PDF)
- Tom Christiansen: Perl Unicode Essentials (archive.org).
- Tim Bray: Characters vs. Bytes
- Joel Spolsky on Unicode
- High End Unicode in Perl 6
Interesting questions and discussions:
Using utf8 in your script proper:
- use utf8;
- Identifiers can contain Unicode, but not arbitrary Unicode characters. See perldata:
If working under the effect of the use utf8; pragma, the following rules apply:
/ (?[ ( \p{Word} & \p{XID_Start} ) + [_] ]) (?[ ( \p{Word} & \p{XID_Continue} ) ]) * /x
That is, a "start" character followed by any number of "continue" characters. Perl requires every character in an identifier to also match \w (this prevents some problematic cases); and Perl additionally accepts identifier names beginning with an underscore.
Variables:
- Variable Scoping in Perl: the basics
- Naming variables:
- Andy Lester: The world's two worst variable names
- Bad variable names
WWW:
- Video: HTTP Clients and Perl - Tom Hukins (at YAPC::EU Sofia, August 2014)
XML:
- Book: Perl and XML (April 2002)
- CPAN: XML::LibXML
- CPAN: XML::XSH2 - wrapper around XML::LibXML
- CPAN: XML::Twig - for huge documents that don't fit in memory
- XPath:
Misc.:
The Monastery:
- To send a message to all janitors: /msg janitors ...
- To send a message to all gods: /msg gods ...
- Tidings: What's New at Perlmonks
Experience (XP) and levels:
- Voting/Experience System
- Levels of Monks
- Scribe, Pilgrim, Hermit, Chaplain, Deacon...
- Monk levels in different languages: Translation of the Perlmonks levels ...
- For monks who have attained the level of...
- Level 1: Initiate
- Level 2: Novice
- Level 5: Beadle
- Level 6: Scribe
- Level 9: Friar
- What is moderation?
- How do I moderate?
- What nodes should/should not be FrontPaged?
- Considering Front Paging a Node?
- What is consideration?
- On Responsible Considerations
- Nodes To Consider
- Level 13: Curate
- Level 26: Saint
- Re: PM Leveling Guide. (humorous, yet with a kernel of truth)
- Zen and the art of ignoring XP
History:
- The early history of Perlmonks
- The First Ten Perl Monks - stalkery but interesting
- The True Catacombs of Perlmonks
- A Level Playing Field - 2005 major overhaul of the XP/levels system
- Perl Monks turns 5
Monks:
- Monks by number of writeups: Monks by Writeup Count
- Monks by experience:
- Number of Monks by Level
- Old (early 2006) XP/level statistics: Re: Unused accounts zombified
- Perl Monks Planet
- The meaning of monks' names: Name Space
Orders:
Write-ups:
- Best Nodes
- Selected Best Nodes
- Re: Selected best nodes - nothing recent, why? (Node Reputation Ranges Over the Years) - ongoing monthly stats kept by eyepopslikeamosquito
- Worst Nodes
- Re: Is PM more active or less active than X years ago? - lots of further links
The Chatterbox:
- Perlmonks Fullpage Chat
- last hour of cb
- Chatterbox FAQ
- Chatterbox statistics, by Tanktalus
- Perlmonks CB60 (last hour of CB, refreshes automatically)
- framechat.pl
- Cookies (more info: The Cookies account)
Misc.:
- 1st Monasterians
- Perlmonks Threaded Article Viewer
- Are you addicted to Perlmonks? - not updated any longer
- Top Ten ways you know you are a Perl Monks Addict.
- New ticker login for PM XML clients - hidden setting to keep yourself from showing up in the "Other Users" nodelet (aka "cloaking")
- PM Drinking Game
General infrastructure:
Modules / *PAN:
News etc.:
Perl culture:
General:
Cool JAPHs:
- The First Ten Perl Obfus
- The Top Ten Perl Obfus
- Individual obfus:
- Pixel art (Rainbow Dash)
- camel code (Camel)
- spaghetti obfu (Italy)
- find-a-func (Llama)
- There can be only one! (One)
- Undefined JAPH
- an ocean of perl creatures (Jellyfish)
- Snow flake (Snowflake)
- haiku errors (俳句)
Golfing:
- Re: reduce code (Golfing References) - many further links
- Shortening codes (various languages)
The Lighter Side of Perl Culture:
- The Lighter Side of Perl Culture (Part I): Introduction
- The Lighter Side of Perl Culture (Part II): JAPH
- The Lighter Side of Perl Culture (Part III): Obfu
- The Lighter Side of Perl Culture (Part IV): Golf
- The Lighter Side of Perl Culture (Part V): Poetry
- The Lighter Side of Perl Culture (Part VI): April Fools
ACME:: modules:
Misc. (unordered, unsorted):
Due to the 64 KiB node size limit, this section now resides in AppleFritter's scratchpad.
Monk quotes:
Do not fear death, you will re-awaken to a world built with Perfect Perl 7 and no Python.
-- boftx, Re^3: Using die() in methods
the moment you try to separate the physical construction of code -- kloc, function points, abstracts test quantities -- from the intellectual processes of gathering requirements; understanding work-patterns and flows; and imagining suitable, appropriate, workable algorithms to meet them; you do not have sufficient understanding of the process involved in code development to be making decisions about it.
-- BrowserUk, Re: Nobody Expects the Agile Imposition (Part VII): Metrics
You were unlucky in the sense that your program seems to have remained valid Perl even with all variables removed.
-- Corion, Re: [OneLiner] What am I doing wrong in my regex?
I insist on being paid to use Windows products, sir!
-- Your Mother, Re^3: PerlWizard - A free wizard for automatic Perl software code generation using simple forms
No further rational discussion is possible here because I find your preferred style utterly abhorrent :)
-- BrowserUk, Re^3: Porting (old) code to something else
AppleFritter elsewhere:
Two monks sat together for lunch. The first monk said, "What do you see when you see me?"