Practice Exercise: Text Manipulation with sed and awk

Objective

Practice text manipulation techniques using the sed and awk commands in a Linux environment.

Task 1: Introduction to `sed`

Open a terminal window.

Use the vi_journal.txt file created earlier, which contains the following text:

vi, a powerful and versatile text editor, is renowned for its efficient search and replace functionality. With vi, you can effortlessly locate specific words or phrases within your document using its robust search feature, and then replace them seamlessly with new content using the replace command. Whether you're editing code, configuring files, or crafting documents, vi's search and replace capabilities empower you to make precise and rapid changes, enhancing your productivity and control over your text editing tasks.

Use sed to replace a word or phrase in the vi_journal.txt file with another word. Save the result as modified.txt.

[intern@intern-a1t-inf-lnx1 ~]$ sed 's/vi/emacs/g' vi_journal.txt > modified.txt
[intern@intern-a1t-inf-lnx1 ~]$ cat modified.txt
emacs, a powerful and versatile text editor, is renowned for its efficient search and replace functionality. With emacs, you can effortlessly locate specific words or phrases within your document using its robust search feature, and then replace them seamlessly with new content using the replace command. Whether you're editing code, configuring files, or crafting documents, emacs's search and replace capabilities empower you to make precise and rapid changes, enhancing your productiemacsty and control over your text editing tasks.
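Note that the output above contains "productiemacsty": the pattern vi also matched inside the word "productivity". A minimal sketch of one way to avoid this, assuming GNU sed (where \b matches a word boundary); the sample sentence and the variable names are made up for illustration:

```shell
# Assumes GNU sed: \b matches at a word boundary, so only the
# standalone word "vi" is replaced, not the "vi" inside "productivity".
sample="With vi, you gain productivity."
result=$(printf '%s\n' "$sample" | sed 's/\bvi\b/emacs/g')
printf '%s\n' "$result"
```

The same substitution could be applied to vi_journal.txt in place of the sample string. Other sed implementations may not support \b.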

Task 2: Advanced `sed` Usage

Create a new text file named data.txt with several lines of data.
Use sed to perform the following tasks:
Delete specific lines containing a certain pattern from data.txt.
Replace text in data.txt using regular expressions.
Append new text to the end of lines in data.txt.

[intern@intern-a1t-inf-lnx1 ~]$ cat data.txt
This is line 1.
This is line 2 with some patterns.
This is line 3.
Pattern ABC should be removed from this line.
This is line 5 with another pattern.
Replace XYZ with ZZZ in this line.
This is line 7.
[intern@intern-a1t-inf-lnx1 ~]$ sed '/ABC/d' data.txt > data2.txt
[intern@intern-a1t-inf-lnx1 ~]$ cat data2.txt
This is line 1.
This is line 2 with some patterns.
This is line 3.
This is line 5 with another pattern.
Replace XYZ with ZZZ in this line.
This is line 7.
[intern@intern-a1t-inf-lnx1 ~]$ sed 's/XYZ/ZZZ/g' data2.txt > data3.txt
[intern@intern-a1t-inf-lnx1 ~]$ cat data3.txt
This is line 1.
This is line 2 with some patterns.
This is line 3.
This is line 5 with another pattern.
Replace ZZZ with ZZZ in this line.
This is line 7.
[intern@intern-a1t-inf-lnx1 ~]$ echo 'Appending a new line' | sed '$a\' >> data3.txt
[intern@intern-a1t-inf-lnx1 ~]$ cat data3.txt
This is line 1.
This is line 2 with some patterns.
This is line 3.
This is line 5 with another pattern.
Replace ZZZ with ZZZ in this line.
This is line 7.
Appending a new line
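The transcript above adds a new line to the end of the file. To append text to the end of each line instead, and to use an actual regular expression in the substitution, the three tasks can be chained in a single sed call. This is a sketch only: the sample lines are a subset of data.txt, and the " [checked]" suffix and the "line N" replacement are made up for illustration.

```shell
# Three expressions applied in order to every input line:
#   1. delete lines containing ABC
#   2. replace "line <digits>" using an extended regular expression (-E)
#   3. append " [checked]" to the end of each remaining line ($ anchors the line end)
result=$(printf '%s\n' \
    'This is line 1.' \
    'Pattern ABC should be removed from this line.' \
    'Replace XYZ with ZZZ in this line.' |
  sed -E -e '/ABC/d' -e 's/line [0-9]+/line N/' -e 's/$/ [checked]/')
printf '%s\n' "$result"
```

Chaining with -e avoids the intermediate files data2.txt and data3.txt, at the cost of making each step harder to inspect on its own.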

Task 3: Introduction to `awk`

Create a sample CSV file named sales.csv with some sample sales data (e.g., product, quantity, price).
Use awk to:
Calculate the total sales for each product.
Find the product with the highest sales.

[intern@intern-a1t-inf-lnx1 ~]$ cat sales.csv
Device,Quantity,Price
Smartphone-A,10,599.99
Tablet,12,399.99
Smartphone-B,8,599.99
Laptop,6,999.99
Headphones-A,3,199.99
Headphones-B,7,159.99
[intern@intern-a1t-inf-lnx1 ~]$ awk -F',' 'NR > 1 {sales[$1] += $2 * $3} END {for (product in sales) print product, sales[product]}' sales.csv
Tablet 4799.88
Smartphone-A 5999.9
Smartphone-B 4799.92
Headphones-A 599.97
Headphones-B 1119.93
Laptop 5999.94

The command above needs a little explanation:
-F',' sets the field delimiter to a comma instead of the default whitespace.
NR > 1 skips the header line.
{sales[$1] += $2 * $3} multiplies the quantity by the price and adds the result to the sales array, using the product name as the key.
In effect, sales["Tablet"] += Quantity * Price.
We use += in case a product is listed more than once. Since this data has no duplicate products, = would also work here.

END runs the block that follows only after all lines have been processed. In this case, it loops through the array and prints each entry. The order in which for (product in sales) visits the entries is unspecified, which is why the output does not follow the order of the file.
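The same per-product calculation can be written as a commented, multi-line awk program. The sketch below feeds in a few rows copied from sales.csv, formats each total to two decimal places with printf, and pipes the result through sort, since awk does not guarantee the order of a for-in loop. The variable name totals is an arbitrary choice.

```shell
# A few rows copied from sales.csv; NR > 1 skips the header line.
totals=$(printf '%s\n' \
    'Device,Quantity,Price' \
    'Smartphone-A,10,599.99' \
    'Tablet,12,399.99' \
    'Laptop,6,999.99' |
  awk -F',' '
    NR > 1 { sales[$1] += $2 * $3 }     # add quantity * price to the product total
    END {
      for (product in sales)            # iteration order is unspecified
        printf "%s %.2f\n", product, sales[product]
    }' | sort)                          # sort by product name for a stable order
printf '%s\n' "$totals"
```

With printf "%.2f", a total such as 5999.9 is printed as 5999.90, which keeps the column of amounts aligned.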

[intern@intern-a1t-inf-lnx1 ~]$ awk -F ',' 'NR > 1 {sales[$1] += $2 * $3} END {max_sales = 0; max_product = "";
for (product in sales) {
  if (sales[product] > max_sales) {
    max_sales = sales[product];
    max_product = product;
  }
}
print "Product with the highest sales:", max_product, "Total Sales:", max_sales
}' sales.csv
Product with the highest sales: Laptop Total Sales: 5999.94

The only difference from the previous command is the END block. This time it loops through the array, compares each product's total with the highest value seen so far, and stores whichever is higher together with its product name.
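An alternative to tracking the maximum inside awk is to let awk print the totals and let sort pick the largest. This is only a sketch of that approach, using a few rows copied from sales.csv; LC_ALL=C is set so that sort treats "." as the decimal point regardless of the system locale.

```shell
# awk prints "<total> <product>"; sort -rn orders numerically, largest first;
# head -n 1 keeps only the top entry.
top=$(printf '%s\n' \
    'Device,Quantity,Price' \
    'Smartphone-A,10,599.99' \
    'Laptop,6,999.99' \
    'Tablet,12,399.99' |
  awk -F',' 'NR > 1 { sales[$1] += $2 * $3 }
             END { for (p in sales) printf "%.2f %s\n", sales[p], p }' |
  LC_ALL=C sort -rn | head -n 1)
printf '%s\n' "$top"
```

Changing head -n 1 to head -n 3 would list the three best-selling products, which is harder to express with the single max_sales variable.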

Task 4: Advanced `awk` Usage

Create a text file named grades.txt containing student names and their corresponding grades.
Use awk to:
Calculate the average grade.
Find students who scored below a certain grade.
Display the student with the highest grade.

# Show the contents of grades.txt (student names and grades)
[intern@intern-a1t-inf-lnx1 ~]$ cat grades.txt
Alice 92
Bob 85
Charlie 78
David 95
Eve 88
Frank 72
Grace 96
Hank 64
Ivy 90
Jack 89

# Calculate the average grade using awk
[intern@intern-a1t-inf-lnx1 ~]$ awk '{ total += $2 } END { average = total / NR; print "Average Grade:", average }' grades.txt

# Find students who scored below a certain grade (e.g., below 80)
[intern@intern-a1t-inf-lnx1 ~]$ awk '$2 < 80 { print $1, "Scored Below 80" }' grades.txt

# Display the student with the highest grade
[intern@intern-a1t-inf-lnx1 ~]$ awk 'NR == 1 { max_grade = $2; top_student = $1 } $2 > max_grade { max_grade = $2; top_student = $1 } END { print "Top Student:", top_student, "Grade:", max_grade }' grades.txt
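The three commands above can also be combined into a single pass over the data. The following is a sketch, not the only way to do it: the threshold is passed in with -v limit=80 (limit is a made-up variable name), lines without two fields are skipped so that a stray blank line does not distort the average, and the grades are the ones listed in grades.txt.

```shell
# Grades from grades.txt, plus a trailing blank line to show the NF guard.
report=$(printf '%s\n' 'Alice 92' 'Bob 85' 'Charlie 78' 'David 95' 'Eve 88' \
    'Frank 72' 'Grace 96' 'Hank 64' 'Ivy 90' 'Jack 89' '' |
  awk -v limit=80 '
    NF < 2 { next }                            # skip blank or malformed lines
    { total += $2; n++ }                       # running sum and count of valid rows
    $2 < limit { below = below " " $1 }        # collect names under the threshold
    n == 1 || $2 + 0 > max { max = $2 + 0; top = $1 }
    END {
      if (n > 0) printf "Average: %.1f\n", total / n
      print "Below " limit ":" below
      print "Top: " top, max
    }')
printf '%s\n' "$report"
```

Dividing by the counter n rather than NR means the average only counts lines that actually contained a grade.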

Conclusion

In this lab exercise, you've practiced text manipulation using the sed and awk commands in a Linux environment. These commands are powerful tools for processing and transforming text data. You've learned how to perform basic and advanced tasks such as find and replace, pattern-based line deletion, and per-field calculations. These skills are valuable for tasks like log file processing, data cleaning, and data analysis.
