Altering the contents of a file using sed

On Unix, Linux, and OS X systems, the sed (stream editor) utility can be used to modify the contents of a file by replacing one string, i.e., sequence of characters, with another. E.g., suppose the file named myfile contains the following lines:

pink blue
red Blue
orange
blue purple blue
blue

If I want to replace all occurrences of the word "blue" with "green", I could issue the following sed command at a Bash shell prompt.

$ sed -i -e 's/blue/green/g' myfile

The -i option indicates the file should be edited in place. I.e., the changes are made within the file itself rather than a new output file being created.
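Since -i overwrites the file, it can be worth keeping a copy of the original. The following is a minimal sketch, not part of the original example: it attaches a backup suffix directly to the option, i.e., -i.bak, a form accepted by both GNU sed and the BSD sed shipped with OS X (where -i expects a suffix argument). The scratch directory and the .bak suffix are assumptions made for illustration.

```shell
# Work in a scratch directory so no real file is modified.
workdir=$(mktemp -d)
printf 'pink blue\nred Blue\n' > "$workdir/myfile"

# -i.bak edits myfile in place and saves the original as myfile.bak.
sed -i.bak -e 's/blue/green/g' "$workdir/myfile"

edited=$(cat "$workdir/myfile")
backup=$(cat "$workdir/myfile.bak")
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"

# Remove the scratch directory again.
rm -rf "$workdir"
```

If the result is not what I wanted, I can copy the .bak file back over the edited one.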

The -e indicates that what follows will be the expression/command for sed to run.

Within the single quotes is the expression for sed to evaluate. The s tells sed I want to perform a substitution operation. The parts of the substitution are enclosed within the forward slashes (/). The string that follows the first slash, i.e., blue, is the string for which sed should search. The string that follows the second slash is the replacement string, i.e., green, which is terminated by another slash. I don't necessarily have to use the slash character as the separator. E.g., I could use a colon (:) instead, i.e., sed -i -e 's:blue:green:g' myfile, and achieve the same effect.
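A different separator is most useful when the strings themselves contain slashes, such as directory paths. The sketch below uses a made-up path and pipes it to sed with printf rather than editing a file; both commands produce the same result, but the second needs no backslashes.

```shell
path='/usr/local/bin/tool'

# With / as the separator, every slash in the strings must be escaped.
escaped=$(printf '%s\n' "$path" | sed -e 's/\/usr\/local/\/opt/')

# With | as the separator, the slashes need no escaping.
piped=$(printf '%s\n' "$path" | sed -e 's|/usr/local|/opt|')

echo "$escaped"
echo "$piped"
```

Both commands print /opt/bin/tool.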

The g after the last slash tells sed that I want to replace all instances of the first string on each line with the second string, not just the first instance it finds on the line. If I used sed -i -e 's/blue/green/' myfile without the g after the last slash, sed would replace only the first instance of "blue" on each line with "green"; e.g., the line "blue purple blue" would become "green purple blue".
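The effect of the flag is easiest to see on a single line with two matches. The sketch below pipes one sample line to sed instead of editing a file; the numeric flag in the last command, which replaces only the Nth match on a line, is not used elsewhere in this article and is included just for comparison.

```shell
line='blue purple blue'

# No flag: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed -e 's/blue/green/')

# g flag: every match on the line is replaced.
every=$(printf '%s\n' "$line" | sed -e 's/blue/green/g')

# Numeric flag: only the second match on the line is replaced.
second_only=$(printf '%s\n' "$line" | sed -e 's/blue/green/2')

echo "$first_only"
echo "$every"
echo "$second_only"
```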

After I execute the command sed -i -e 's/blue/green/g' myfile the file would then contain the lines below:

pink green
red Blue
orange
green purple green
green

All occurrences of the word "blue" were replaced with "green". "Blue" was not replaced because the search is case sensitive, so a capital "B" is not the same as a lower-case "b". The tool supports regular expressions, so, if I wanted instances of "Blue" replaced with "green", I could use the following command, instead.

sed -i -e 's/[bB]lue/green/g' myfile

If I enclose letters within brackets, that is a regular expression indicating to sed that I want it to look for any instances of either a "b" or a "B" followed by "lue" and replace them with "green". If I ran that command on the original file, I would then have the following lines in it.

pink green
red green
orange
green purple green
green

I could enclose more than just two characters within the brackets. E.g., if I also wanted sed to replace any instances of "clue" or "Clue" with "green" as well, I could use the command below.

$ sed -i -e 's/[bBcC]lue/green/g' myfile

Alternatively, if I wanted sed to ignore the case of any characters in the search string, e.g., if I wanted it to replace "blue", "Blue", "BLUE", etc., I could just put an I after the last separator (the I flag is a GNU sed extension). E.g., I could use sed -i -e 's/blue/green/Ig' myfile. The search would then be case-insensitive and global, replacing any variations in capitalization of the word "blue" with "green" even if the word occurred multiple times on the line. If I only wanted the first occurrence of the word on each line to be replaced, I would omit the "g" from /Ig.
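The difference between the bracket approach and the I flag can be seen with a sample containing an all-capitals "BLUE". This is a small sketch that assumes GNU sed, since the I flag is not available in every sed implementation; the sample text is made up for illustration.

```shell
sample=$(printf 'red Blue\nBLUE and blue\n')

# [bB]lue matches "blue" and "Blue" but not "BLUE".
bracket=$(printf '%s\n' "$sample" | sed -e 's/[bB]lue/green/g')

# The I flag (GNU sed) ignores case for the whole search string.
nocase=$(printf '%s\n' "$sample" | sed -e 's/blue/green/Ig')

printf '%s\n---\n%s\n' "$bracket" "$nocase"
```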

If I wanted to create a Bash script that any user could run that would prompt the user for text to be replaced in a file, the replacement text, and file name, I could put the following lines in the script, which I will name replace_text.

#!/usr/bin/bash
read -r -p "File name: " filename
read -r -p "Text to replace: " replace
read -r -p "Replacement text: " replacement
# Double quotes let the shell expand the variables before sed runs.
sed -i -e "s/$replace/$replacement/g" "$filename"

The read command is explained in section "8.2.1 Using the read built-in command" of the Bash Guide for Beginners.
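One caveat with a script like this is that whatever the user types becomes part of the sed expression, so a search string containing characters such as ".", "*", or "/" is interpreted as a regular expression or as the separator rather than as plain text. The sketch below shows one possible way to handle that by escaping those characters first. The helper names escape_search and escape_replacement are hypothetical, not part of sed or Bash, and the sketch assumes single-line input and sed's default basic regular expressions.

```shell
# Hypothetical helper: escape characters that are special in a basic
# regular expression, plus the / separator, so the text matches literally.
escape_search() {
    printf '%s\n' "$1" | sed -e 's|[][\\/.^$*]|\\&|g'
}

# Hypothetical helper: escape the characters that are special in a
# replacement string (backslash, the / separator, and &).
escape_replacement() {
    printf '%s\n' "$1" | sed -e 's|[\\/&]|\\&|g'
}

replace='a.b/c'          # sample input standing in for what a user types
replacement='x&y/z'

search_re=$(escape_search "$replace")
replacement_re=$(escape_replacement "$replacement")

# Only the literal text a.b/c is replaced; aXb/c is left alone.
result=$(printf 'a.b/c and aXb/c\n' | sed -e "s/$search_re/$replacement_re/g")
echo "$result"
```

Without the escaping, the "." in the search string would match any character, and the "/" would end the search string early.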

To make the script executable, I could change the file permissions with the command chmod 775, which gives execute permission to the owner of the file, to any accounts in the same group as the owner, and to all other accounts on the system.

$ ls -l replace_text
-rw-rw-r--. 1 ann ann 171 Jan  8 12:49 replace_text
$ chmod 775 replace_text
$ ls -l replace_text
-rwxrwxr-x. 1 ann ann 171 Jan  8 12:49 replace_text
$

Then when the script is executed, it will prompt the user running it for the search and replacement strings and the file on which the command should operate.

$ ./replace_text
File name: myfile
Text to replace: blue
Replacement text: green
$

Rather than having sed take its input from a file, you can also pipe input to it from another command. E.g.:

$ echo "My favorite color is blue." | sed -e 's/blue/green/'
My favorite color is green.
$

That isn't particularly useful in the simple example above, but that can be useful in Bash scripts. E.g., I have some HTML files that I need to alter to pass the W3C Markup Validation check for compliance with HTML standards, so I include sed commands to do so in the following script.

#!/usr/bin/bash

if [ -z "$1" ]; then
   read -p "File name: " filename
else
   filename=$1
fi
if [ -z "$2" ]; then
   read -p "Link Name: " linkname
else
   linkname=$2
fi

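# Open a centered paragraph before the existing text.
sed -i -e 's/^/<p style="margin: 0px; text-align: center;">\n/' "$filename"
# Escape ampersands; \& is a literal ampersand in the replacement.
sed -i -e 's/&/\&amp;/g' "$filename"
# Replace the border attribute with the equivalent style.
sed -i -e 's/border=0/style="border:0;"/g' "$filename"
# Add alt text to the 1x1 pixel image and close the paragraph.
sed -i -e 's/subid=0" >/subid=0" alt="1x1 px">\n<\/p>/' "$filename"
# Double quotes here so the shell expands $linkname.
sed -i -e "s/<IMG /\n<IMG alt=\"$linkname\" /" "$filename"
sed -i -e "s/<\/a>/<br>\n$linkname<\/a>\n/" "$filename"
# Create a copy named for the half-size image and set its dimensions.
newfilename=$(echo "$filename" | sed -e 's/480x270/240x135/')
cp "$filename" "$newfilename"
sed -i -e 's/<IMG /<IMG width="240" height="135" /' "$newfilename"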

The script, named updatefile, can accept the parameters it uses, a file name (filename) and a link name (linkname), as arguments given on the command line, e.g., updatefile myfile "Linux Tips". The script checks whether the parameters are given on the command line and, if not, prompts for them. The first parameter, $1, should be the file name and the second parameter, $2, should be the descriptive text for the link, which will be assigned to the variable "linkname".

In the first sed line, the expression 's/^/<p style="margin: 0px; text-align: center;">\n/' uses the caret (^) symbol as the string I want to replace. In regular expressions, a ^ represents the beginning of the line while a dollar sign ($) represents the end of the line. So you can use a caret as the search string for sed and then put in text you want to insert at the beginning of a line. The \n signifies that a newline should be inserted after that text, as GNU sed does, which is like hitting the Enter key when you are editing a file to start a new line. Note that sed applies the substitution to every line it reads, so the result described here, <p style="margin: 0px; text-align: center;"> inserted as the first line before any existing text, holds for a file that consists of a single line; for a multi-line file the text would be inserted before each line.
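The behavior of the two anchors can be checked on a two-line sample. In this sketch, which is separate from the updatefile script, the "> " and " <" markers are arbitrary text chosen for illustration, and the 1 in front of the s in the second command is a line address that limits the substitution to the first line.

```shell
# ^ matches at the start of every line, so the text is added to each one.
every=$(printf 'one\ntwo\n' | sed -e 's/^/> /')

# A leading line address (1) limits the substitution to the first line.
first=$(printf 'one\ntwo\n' | sed -e '1s/^/> /')

# $ matches at the end of every line, so the text is appended to each one.
ends=$(printf 'one\ntwo\n' | sed -e 's/$/ </')

printf '%s\n---\n%s\n---\n%s\n' "$every" "$first" "$ends"
```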

The next three sed lines perform some text substitution in the file. The first replaces any occurrences of an ampersand character with the HTML code for an ampersand, i.e., &amp;. In a sed replacement string an unescaped & stands for the text that was matched; here the matched text is itself an ampersand, so & followed by amp; and the escaped form \&amp; both produce &amp;. The second sed command makes the code HTML5 compliant by replacing instances of border=0 with style="border:0;" because the former code, which was acceptable with HTML 4, was deprecated in HTML5. The next line inserts an "alt" attribute for one image in the file that is only 1 pixel wide by 1 pixel high, since "img" tags should have a text description that a browser can show instead of the image if the image can't be displayed for any reason. E.g., there are still text browsers, such as Lynx. The link for that particular image has subid=0 in the URL followed by a double quote, so that is what I have sed search on. I place the closing </p> paragraph tag after the alt attribute, since that image is the last element in the file.
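The special meaning of & in a replacement string is worth a quick illustration. In the sketch below, which uses made-up sample text, the first command writes a literal ampersand with \& and the second uses an unescaped & to reuse the matched text.

```shell
# \& in the replacement is a literal ampersand.
escaped=$(printf 'Tom & Jerry\n' | sed -e 's/&/\&amp;/g')

# An unescaped & in the replacement stands for the matched text.
wrapped=$(printf 'blue sky\n' | sed -e 's/blue/[&]/')

echo "$escaped"
echo "$wrapped"
```

The first command prints Tom &amp; Jerry and the second prints [blue] sky.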

The next sed line inserts the value of the linkname variable between the double quotes in alt="". Either single quotes or double quotes can be used to enclose the expression that the sed command will evaluate and process. In the first four sed lines, I used single quotes. But in the fifth one, which inserts the link name text between the double quotes, I need to use double quotes. Within single quotes, the shell passes $linkname to sed just as it appears, i.e., it won't expand the variable, so sed would insert the literal text $linkname. Within double quotes, the shell replaces $linkname with the text that I provided on the command line, or when prompted by the script, before sed sees the expression.

But when I surround the expression with double quotes and also have double quotes within the expression itself, I have another problem: the shell would treat the first double quote inside the expression as the match for the double quote that opened the expression. I can avoid that by placing an escape character before any double quote that I want regarded as just another character like "a", "b", "c", etc. The escape character is the backslash (\) character, which indicates that the character immediately following the backslash is to be treated normally rather than as a special character. So I use alt=\"$linkname\" with a backslash before each double quote. I also needed a backslash before the slash character in the prior line for the closing paragraph tag, i.e., <\/p>, for a similar reason: without it, sed would take that slash as the separator that ends the replacement string.
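The effect of the two quoting styles can be compared directly. This sketch pipes a made-up IMG tag to sed instead of editing a file, and the value assigned to linkname is only an example.

```shell
linkname='Linux Tips'

# Single quotes: the shell leaves $linkname alone, so sed inserts it literally.
single=$(printf '<IMG src="x.png">\n' | sed -e 's/<IMG /<IMG alt="$linkname" /')

# Double quotes: the shell expands $linkname; inner double quotes are escaped.
double=$(printf '<IMG src="x.png">\n' | sed -e "s/<IMG /<IMG alt=\"$linkname\" /")

echo "$single"
echo "$double"
```

The first result contains alt="$linkname" and the second contains alt="Linux Tips".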

If you want sed to insert a newline character, i.e., to start another line at a point in the text, you can place a \n at the point where you want the new line to start which is what I do on the following line:

sed -i -e "s/<\/a>/<br>\n$linkname<\/a>\n/" $filename

That line inserts the HTML <br> break tag in the text and then breaks the line at that point as well, by inserting a newline character, which makes the text easier for me to read.

The file names I'm working with all end with "480x270", which are the dimensions of an image linked to by the file. I want to edit the original file, but also create a new file that will display the image at 1/2 the original dimensions, i.e., 240 pixels wide by 135 pixels high. So I pipe the value of the filename variable to the sed command using the echo command and, using sed, replace the "480x270" at the end of the file name with "240x135". I use command substitution to assign the output of those commands to the variable newfilename by placing $() around the echo and sed commands.

newfilename=$(echo "$filename" |  sed -e 's/480x270/240x135/')
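If a file name without "480x270" in it is given, the substitution leaves the name unchanged, and a later cp would be asked to copy the file onto itself. The sketch below shows one way such a case might be detected. The function name derive_name and the file name notes.txt are hypothetical, and the $ anchor, which the script above does not use, restricts the match to the end of the name.

```shell
# Hypothetical helper: swap a trailing 480x270 for 240x135.
derive_name() {
    printf '%s\n' "$1" | sed -e 's/480x270$/240x135/'
}

filename='Learn_GREP_SED_on_Linux_for_Beginners_480x270'
newfilename=$(derive_name "$filename")
echo "$newfilename"

# A name without the suffix comes back unchanged, which can be tested for.
other=$(derive_name 'notes.txt')
if [ "$other" = 'notes.txt' ]; then
    echo "notes.txt does not end with 480x270; nothing to copy" >&2
fi
```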

I can then copy the 480x270 file to one with the same name except with the "480x270" at the end of the filename replaced with "240x135". I can then edit the new file and add width="240" height="135" to the IMG tag for the image, so though the image is still 480 pixels wide by 270 pixels high, browsers will display the image as one 240 pixels wide and 135 pixels high.

sed -i -e 's/<IMG /<IMG width="240" height="135" /' $newfilename

So then I can run the script from a shell prompt with a command similar to the one below. The script has sed perform all of the substitutions I need in the input file, copies that file to another similarly named file with the "480x270" part of the file name replaced with "240x135", and then has sed make the appropriate alteration to the contents of the new file.

./updatefile Learn_GREP_SED_on_Linux_for_Beginners_480x270 "Learn GREP and SED on Linux for Beginners"

Related articles:

  1. How to get cat to process a file name provided in the output of another command
    Created: Friday June 10, 2016
    Last modified: Friday June 10, 2016