Well... defining what a real thread is kind of confusing.
Actually, it's not. A "thread" is a schedulable unit of execution context. Thereby making kernel threads like Windows threads and pthreads--as used by ithreads--real threads. (The 'real' is redundant.)
It also makes some user-space implementations--such as found in Java 1.1, Erlang, and others--that implement their own internal scheduler, also threads.
But coroutines are not threads. They are coroutines.
I think Coro is really neat
I also think Coro is extremely clever code. And its author, an extremely clever coder. There have even been a few occasions when I have sorely wished that Coro ran on my platform. There is no reason it shouldn't. The basic, underlying longjump mechanism works natively just fine--it is used for exception handling. It's just the implementation that prevents it.
And my recognition of the author's skills and knowledge are what makes me think that his diatribe in the Coro POD, is neither ignorance nor confusion> But simple politicking of the worst kind. Done in the full knowledge and aforethought of malice, that it is both factually incorrect, and likely to lead some--like binary perhaps--into confusion.
I think that if there is any real confusion, it comes because Linux treats threads and processes very similarly. To the extent that some versions of top actually list the threads of a single process as if they were separate processes.
To quote
Threads of execution, often shortened to threads, are the objects of activity within the process. Each thread includes a unique program counter, process stack, and set of processor registers. The kernel schedules individual threads, not processes. In traditional Unix systems, each process consists of one thread. In modern systems, however, multithreaded programs—those that consist of more than one thread—are common. As you will see later, Linux has a unique implementation of threads: It does not differentiate between threads and processes. To Linux, a thread is just a special kind of process.
The thing that makes them "special", is that they share address space. Perl's threads also share address space at the C level.
It is the programming model that ithreads layers on top of those underlying kernel threads, that restricts the access of individual threads within the process, to subsets of the full memory allocated to that process.
It does this by segregating memory allocations made by different threads, to different segments ("arenas") of the memory allocated to the process. But it is only Perl and the threading model chosen, that enforces this segregation; not the OS. Indeed, the segregation is quite easily defeated.
The choice of an 'explicitly-shared only' model was a) a concious choice; b) done with very good reason.
And IMO c) will in the longer term be seen as both inspired, and "the way to go".
The current implementation lets it down somewhat because of its memory-hungriness, and (lack of) speed. But this could (and hopefully, soon will be) addressed. The main problem with the current implementation is that is uses a 'double-tieing' mechanism for the scalars held in shared aggregate structures.
That is to say, both the AV or HV of a shared structure, and the individual scalars they contain, have attached magic. This means that not only is the size of every aggregate-held scalar, inflated in size by the attached magic, but also that each thread that has visibility of the shared structure, also requires a--relatively lightweight, but still significant--place-holder or alias object to every scalar held in the shared structure. This is both quite costly--and unnecessary.
The scalars that live within a tied aggregate don't need to have individually attached magic. (Nor even any physical storage allocation, but that's a twist that we can skip for now.) When a FETCH or STORE is invoked upon a tied array ot hash, the magic attached to the AV or HV has enough information to read or write the actual element without requiring further magic be attached to each individual scalar.
Not only would the removal (or rather the avoidance of attachment) of magic to the individual scalars considerably lessen the size of the shared aggregates, it would also remove the need for per-thread place-holders for them also. So, each thread would retain a single, lightweight reference to the shared AV or HV, and access it contents through that via it's attached magic, with the result that the memory cost of the shared aggregate is further reduced.
The final icing on the cake is that indirecting through only one level of magic instead of two would considerably speed up accesses.
In a nutshell, you can wrap a class around an aggregate with having to make the individual elements of that aggregate objects in their own right. And the memory and performance saving of that are legion. And this could (and will if I ever master the intricacies of XS) be implemented now.
But none of this detracts from either the desirability of preventing the unintentional, accidental sharing of thread-specific data; nor the usability of the current implementation. Just as with regexes (and every other aspect of Perl, and other languages), implementations can be improved, incrementally over time. Provided that the basic programming model is right.
And (IMO) the ithreads model is.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.