Tue, Mar 02, 2010 4:09 pm

OS X Line Endings

Operating systems handle the line endings in text files in different ways. For DOS and Microsoft Windows, the end of a line is marked by a carriage return (CR) and a line feed (LF) character.

The CR and LF characters were originally used on teletypewriters, also known as teleprinters, which were electromechanical typewriters used for telecommunications or to control early computers. Initially, a carriage return caused the cylinder on which the paper was held (the carriage) to return to the left side of the paper after a line of text had been typed, without advancing the paper to a new line. Later, the carriage return would usually advance the paper to the next line as well. The return key on a computer keyboard today is a descendant of the carriage return on those earlier teletype machines. In most word processors, hitting the return key moves the cursor to the beginning of the next line.

If you are working on a text file, e.g. one with a .txt extension, on a DOS or Microsoft Windows system, when you hit the return key two characters are inserted in the file at that point, a carriage return (CR) character followed by a line feed character, which have the following hexadecimal representations.

Description            Hex
Carriage Return (CR)   0D
Line Feed (LF)         0A

But, if you are working on a Linux or Unix system, then only the LF character is inserted at the end of a line when you hit return. This may be due to a desire to reduce disk storage space for text files on early Unix computers; disk storage was much more limited than it is today.

Classic Mac OS used yet another convention: just the CR character marks the end of a line. Mac OS X is a Unix-based operating system with a heritage in BSD Unix, so its command-line tools use LF, but some Mac applications still follow the older CR convention under OS X.

OS                                  Newline   Hexadecimal
DOS/Windows                         CR LF     0D 0A
Linux/Unix                          LF        0A
Mac OS (classic; some OS X apps)    CR        0D
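To see the three conventions side by side, here is a minimal sketch that emits the same two lines with each terminator and dumps the result with od. The function names (dos_sample, unix_sample, mac_sample) are made up for illustration; only printf and od are assumed to be available.

```shell
# Emit the same two lines with each newline convention.
# printf interprets \r as CR (hex 0D) and \n as LF (hex 0A).
dos_sample()  { printf '123\r\n456\r\n'; }   # DOS/Windows: CR LF
unix_sample() { printf '123\n456\n'; }       # Linux/Unix: LF
mac_sample()  { printf '123\r456\r'; }       # CR only

# Dump each sample character by character.
for sample in dos_sample unix_sample mac_sample; do
    echo "$sample:"
    "$sample" | od -c
done
```

The DOS sample is two bytes longer than the other two, one extra byte per line.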

So some Mac applications will, when you save a file as a text file, put just a CR at the end of each line. However, if you edit a file from the command line on a Mac OS X system with a program such as vi, an editor that comes with Mac OS X but was originally developed for Unix, the file will be saved with the LF (hex 0A) character at the end of each line.

E.g., I can create a text file test.txt with vi and put just the following two lines in it:

123
456

If I examine the contents of the file with the od program, I see the following, if I use the -c option to display ASCII characters or backslash escapes:

GS01:Documents jsmith$ od -c test.txt
0000000    1   2   3  \n   4   5   6  \n                                
0000010

The \n at the end of each line represents a newline, which here is the LF character.

But, if I use -ax to see the ASCII and hexadecimal contents of the file, I see the following:

GS01:Documents jsmith$ od -ax test.txt
0000000    1   2   3  nl   4   5   6  nl                                
             3231    0a33    3534    0a36                                
0000010

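I see that the lines are terminated with the hexadecimal 0A character for the newline character. Note: the -x option prints the data as two-byte words, so on a little-endian system such as this one the bytes within each word of the hexadecimal representation appear reversed relative to the ASCII representation above them, i.e., in 3231 the 32 represents 2 and the 31 represents 1.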
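If the swapped byte pairs are distracting, od can be asked for one-byte units instead. The sketch below wraps that in a small helper; the name hex_bytes is hypothetical, while -An (no offset column) and -tx1 (one-byte hexadecimal units) are standard od options.

```shell
# Print each byte of the input as two hex digits, in file order.
# -An suppresses the offset column; -tx1 selects one-byte hexadecimal units.
hex_bytes() {
    od -An -tx1 "$@"
}

# Same content as the test.txt example, fed in on standard input.
printf '123\n456\n' | hex_bytes
```

With this form the bytes are listed in the order they occur in the file: 31 32 33 0a 34 35 36 0a. The exact spacing of the output differs between od implementations.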

If you need to convert a file that uses the Mac style of terminating lines with a CR character to the Linux/Unix style of using an LF character, you can use the following procedure within vi, taken from Using the shell (Terminal) in Mac OS X.

Type "1,$s/" and then press CTRL-V followed by CTRL-M. When you press CTRL-V nothing appears to happen, but the CTRL-M shows up as "^M". Continue with "/" and then CTRL-V again. Hit RETURN (which will show up as ^M and you could do that too - I just like it this way) and finally "/g". On your screen the whole thing looks like:

   :1,$s/^M/^M/g

What does that mean? It means "Starting at line 1 and stopping at the end of the file (1,$), substitute (s) any CTRL-M (/^M/) with Unix CTRL-M (^M/) and do it for the entire line rather than just the first CTRL-M you find (g) (On most other Unixes I'd just do s/^M//g ; I don't know why Mac OS X didn't let me do that). It is a little strange that you replace ^M with ^M but get something entirely different, but that's a subject for another day. The morbidly curious can start by typing "man stty" if they need to know now.

You can then use :wq to save the file under the same name or :wq newfilename.txt to give the converted version a new name.

Or, alternatively, if you don't want to use the vi editor, you can use the following:

tr '\r' '\n' < file1 > file2

That will use the translate, i.e. tr, command to translate all instances of the carriage return character, represented by \r to the newline character, in this case the LF character used on Unix systems.
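Before converting, it can help to check which convention a file actually uses. The following is a rough sketch that only counts CR and LF bytes; the function name line_ending_style and its output labels are my own invention, and it makes no attempt to handle files with mixed endings or binary data.

```shell
# Report which newline convention a file appears to use, judged only by
# counting CR (\r) and LF (\n) bytes in it.
line_ending_style() {
    # tr -cd keeps only the named character; wc -c counts what is left.
    # The trailing tr strips the padding some wc implementations add.
    cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
    lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
    if [ "$cr" -eq 0 ] && [ "$lf" -eq 0 ]; then
        echo "none"
    elif [ "$cr" -eq 0 ]; then
        echo "unix"      # LF only
    elif [ "$lf" -eq 0 ]; then
        echo "mac"       # CR only
    else
        echo "dos"       # both present, presumably as CR LF pairs
    fi
}

# Usage: line_ending_style test.txt
```

A file that reports "mac" is a candidate for the tr conversion above; one that already reports "unix" needs no conversion.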

If you wish, you could also create a script, e.g., mac2unix to perform the translation:

test $# -eq 2 -a "$1" != "$2" && tr "\015" "\012" < "$1" > "$2" || echo "Usage: mac2unix f1 f2"

After changing the permissions on the file with chmod 755 mac2unix, you could use mac2unix file1 file2 to convert the contents of file1 and store the result in file2.

On a Mac OS X system, I receive email messages from a Unix system that contain gpg-encrypted data. If I try to decrypt them with gpg --decrypt file1.gpg >file2.txt on the Mac system, I receive the error message gpg: [don't know]: invalid packet (ctb=53). So I first need to convert file1 with this procedure before running gpg to decrypt it.

If you needed to convert a file on a Mac system to the text format for a DOS or Microsoft Windows system, you could create a script, e.g. mac2dos to perform the conversion:

test $# -eq 2 -a "$1" != "$2" && { mac2unix "$1" "$2"; unix2dos "$2" "$2"; } || echo "Usage: mac2dos f1 f2"

That script would rely on the mac2unix script you created previously, and on a unix2dos command being available on the system.
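If no unix2dos command is installed, the same Mac-to-DOS conversion can be approximated with tr and awk alone. This is only a sketch under that assumption; the function name mac_to_dos is hypothetical, and it works as a filter on standard input rather than taking two file names.

```shell
# Convert CR-terminated text on stdin to CR LF-terminated text on stdout.
# tr turns each CR into LF so that awk sees ordinary lines;
# awk then writes each line back out with a CR LF ending.
mac_to_dos() {
    tr '\r' '\n' | awk '{ printf "%s\r\n", $0 }'
}

# Usage: mac_to_dos < file1 > file2
```

Note that awk terminates every line it prints, so a final line that had no CR in the input still ends with CR LF in the output.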

To go the other way, e.g. from DOS/Windows to the Mac text format or from Unix to the Mac format, you could use the following scripts:

dos2mac

test $# -eq 2 -a "$1" != "$2" && tr -d "\012" < "$1" > "$2" || echo "Usage: dos2mac f1 f2"

unix2mac

test $# -eq 2 -a "$1" != "$2" && tr "\012" "\015" < "$1" > "$2" || echo "Usage: unix2mac f1 f2"

References

  1. Carriage return
    Wikipedia, the free encyclopedia
  2. Newline
    Wikipedia, the free encyclopedia
  3. Teleprinter
    Wikipedia, the free encyclopedia
  4. Using the shell (Terminal) in Mac OS X
    Date: December 2002
    MacOSX articles at APLawrence.com
  5. Vi
    Wikipedia, the free encyclopedia
  6. Line Breaks
    Date: July 1, 2003
    By: Rodney Sparapani/Medical College of Wisconsin
    The ESS-help Archives
  7. Why is the line terminator CR+LF?
    Date: March 18, 2004
    By: oldnewthing
    The Old New Thing
