PerlMonks  

Re: Is there a good way to unify text files something like dos2nix shell script(s) do?

by 2teez (Priest)
on Dec 20, 2013 at 05:21 UTC (#1067917)


in reply to Is there a good way to unify text files something like dos2nix shell script(s) do?

Hi taint,
I know that on a Linux box you can use the split command to break a file into pieces of a given size, like split -b 1024m file_name bfile, then use the cat command to assemble the pieces again, like cat bfile* > file_name
I know it has been used for several file types.
I don't know if that is what you are looking for, or if Windows has something similar to these.
man split and man cat show their usage.
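
For what it's worth, here is a minimal sketch of that split/cat round trip. The file names (demo.bin, rebuilt.bin), the part_ prefix, and the small 1024-byte piece size are placeholders picked for illustration, not anything from the original question.

```shell
#!/bin/sh
# Sketch: split a file into fixed-size pieces, reassemble them, and verify.
# demo.bin, part_ and rebuilt.bin are made-up names for this example.
workdir=$(mktemp -d) || exit 1

# Make a 5000-byte sample file to play with.
head -c 5000 /dev/urandom > "$workdir/demo.bin"

# Break it into 1024-byte pieces named part_aa, part_ab, ...
split -b 1024 "$workdir/demo.bin" "$workdir/part_"

# The shell expands part_* in sorted order, so cat restores the original order.
cat "$workdir"/part_* > "$workdir/rebuilt.bin"

# cmp is silent and returns 0 when the two files are byte-for-byte identical.
cmp "$workdir/demo.bin" "$workdir/rebuilt.bin" && echo "files match"
```

Note that the prefix given to split has to match the glob given to cat, otherwise cat will not find the pieces.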

If you tell me, I'll forget.
If you show me, I'll remember.
If you involve me, I'll understand.
--- Author unknown to me


Re^2: Is there a good way to unify text files something like dos2nix shell script(s) do?
by taint (Chaplain) on Dec 20, 2013 at 05:44 UTC

    Thanks for the reply, 2teez.

    I'm also on a *NIX box (FreeBSD). It looks like I may not have used the best word (unify) to describe my ultimate goal. My goal is to parse files recursively and, based on their format (iso-*-*, line endings, perhaps trailing spaces), unify them, in the sense that they all end up the same in those respects: ultimately (for me) utf-8, *NIX line endings, and no trailing spaces. I don't have much difficulty making the conversions; the hard part is "tasting" the file beforehand, so as to convert it w/o buggering it up. For example, with a file in a different (spoken) language that isn't already utf-8, knowing in advance what it is and converting it to utf-8 can be tricky, even though I know Perl is pretty good at it.
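
A rough shell sketch of that kind of clean-up, under some loud assumptions: normalize_file is a made-up helper name, and the "tasting" step only checks whether the bytes are already valid utf-8. Anything that fails the check is assumed to be in a fallback encoding (ISO-8859-1 unless told otherwise), which is a guess and not real detection.

```shell
#!/bin/sh
# normalize_file FILE [FALLBACK_ENCODING]
# Hypothetical helper: rewrites FILE in place as UTF-8 with LF line endings
# and no trailing spaces or tabs. FALLBACK_ENCODING (default ISO-8859-1) is
# an assumption, used only when FILE is not already valid UTF-8.
normalize_file() {
    file=$1
    fallback=${2:-ISO-8859-1}
    tmp=$(mktemp) || return 1

    # "Taste" the file: iconv fails if the bytes are not valid UTF-8.
    if iconv -f UTF-8 -t UTF-8 "$file" > /dev/null 2>&1; then
        cp "$file" "$tmp"
    else
        # Not UTF-8, so convert from the assumed fallback encoding.
        iconv -f "$fallback" -t UTF-8 "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    fi

    # Literal CR and TAB characters for use inside the sed expressions.
    cr=$(printf '\r')
    tab=$(printf '\t')

    # On every line: drop a trailing CR, then drop trailing spaces and tabs.
    LC_ALL=C sed -e "s/$cr\$//" -e "s/[ $tab]*\$//" "$tmp" > "$file"
    rm -f "$tmp"
}
```

To go through a tree, something like find . -type f -name '*.txt' | while IFS= read -r f; do normalize_file "$f"; done would call it on each file. The weak spot is exactly the one described above: a file in some other legacy encoding will be converted "successfully" from the wrong source encoding, so the fallback has to be known per file or per directory.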

    I'm still searching, and while I haven't found a complete solution, I did find a couple of interesting Text::Filter modules that may help in cobbling something up. In fact, they're pretty nice general-purpose filters for a lot of things: Text::Filter and Text::Filter::Chain. If I don't use them for this project, I can sure think of a lot of other things to use them with. :)

    Thanks again, 2teez, for the reply.

    --Chris

    Yes. What say about me, is true.
    
      "It looks like I may not have used the best word to describe my ultimate goal". No surprises there then.
