This document provides a comprehensive overview of filters in UNIX, defining a filter as a program that processes text files by taking input from stdin and producing output to stdout. Key historical insights trace the origin of filters to M. D. McIlroy and the introduction of the vertical bar notation by K. L. Thompson. Practical examples of filters like detab, entab, compress, and simple filters such as head and tail are explored, alongside character translation using the tr command. This guide serves as a foundational resource for effectively utilizing filters in text processing.
CSC 4630 Meeting 2 January 22, 2007
Filters
• Definition: A filter is a program that takes a text file as an input and produces a text file as an output.
• UNIX context
  • Write filters to use stdin as the input file and stdout as the output file.
  • Use pipe to connect filters. Notation is the vertical bar |
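As a minimal sketch of the pipe notation, the pipeline below chains two standard filters. The sample text and the variable name `result` are made up for illustration.

```shell
# sort orders the lines; uniq then drops adjacent duplicates.
# Each program reads stdin and writes stdout, so | can chain them.
result=$(printf 'pear\napple\npear\nfig\n' | sort | uniq)
printf '%s\n' "$result"
```

Neither program knows about the other: the shell connects the stdout of `sort` to the stdin of `uniq`.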
Filter History
• Originally conceived by M. D. McIlroy in the early 1970s
• The UNIX notation for a pipeline, the vertical bar, was introduced by K. L. Thompson.
• “The Unix time-sharing system,” Comm. ACM, July 1974.
• “The Unix programming environment,” Software Practice and Experience, January 1979.
Filter Examples
• detab -- replaces tab characters in a text file with the appropriate number of space characters
• entab -- replaces long strings of space characters with the appropriate number of tab characters
• compress -- replaces long strings of the same character with a coding for the string and its length
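A filter named detab or entab may not be installed on a given system. As a rough stand-in, the POSIX utilities `expand` and `unexpand` do the same two jobs, assuming the default tab stops every 8 columns. This `expand` is a tab expander, not a decompression tool. The sample input and variable names are hypothetical.

```shell
# detab-like: replace the tab with spaces up to the next tab stop (column 8).
detabbed=$(printf 'a\tb\n' | expand)
printf '%s\n' "$detabbed"

# entab-like: -a asks unexpand to convert runs of blanks anywhere in the
# line, not only leading ones, back to tabs where they reach a tab stop.
entabbed=$(printf '%s\n' "$detabbed" | unexpand -a)
printf '%s\n' "$entabbed" | od -c
```

Piping the result through `od -c` makes the tab visible as `\t`, since tabs and spaces look alike on a terminal.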
Filter Examples (2)
• expand -- reverses the action of the compress filter
• translit -- in its simplest form, takes two argument strings of equal length and changes every occurrence of a character in the first string into the corresponding character of the second string.
  • Example: translit abc xyz changes all a’s to x’s, b’s to y’s, and c’s to z’s.
Simple Filters: head and tail
• Use head to look at the first few lines of a file
  • Keeps the first n lines of input and discards the rest
    head [-n] [file]
• Use tail to look at the last few lines of a file
  • Keeps the last n lines of input and discards the rest
    tail [-n] [file]
  • Discards the first n-1 lines of input and keeps the rest
    tail [+n] [file]
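The three behaviors above can be sketched with the POSIX `-n` spelling of the options, which current systems accept. The five-line input and the variable names are made up for illustration.

```shell
lines=$(printf '1\n2\n3\n4\n5\n')

first=$(printf '%s\n' "$lines" | head -n 2)    # keeps lines 1-2
last=$(printf '%s\n' "$lines" | tail -n 2)     # keeps lines 4-5
rest=$(printf '%s\n' "$lines" | tail -n +3)    # discards lines 1-2, keeps 3-5
printf '%s\n' "$first" "$last" "$rest"
```

Note that `tail -n +3` starts output at line 3, so it discards n-1 = 2 lines, matching the description of the +n form.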
tr Command
• tr translates characters
• Two arguments, given as strings
• Three options: -c (complement), -d (delete), -s (squeeze repeats)
• Often used for letter case conversion
• Useful for “cleaning up” formatted output
tr Examples
• tr A-Z a-z
  Converts all upper case letters to lower case
• tr -s <tab><space> <space><space>
  • In octal: tr -s '\011\040' '\040\040'
• tr -d <CR>^Z
  • In octal: tr -d '\015\032'
• tr -cs A-Za-z '\012'
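A minimal sketch of three of these uses, run on made-up sample strings. The variable names are hypothetical, and the carriage-return example deletes only octal 015, not Ctrl-Z.

```shell
# Letter case conversion.
lower=$(printf 'Hello World\n' | tr A-Z a-z)

# -d deletes: strip carriage returns (octal 015) from DOS-style line endings.
unix=$(printf 'one\r\ntwo\r\n' | tr -d '\015')

# -c complements the first set and -s squeezes repeats, so every run of
# non-letters becomes one newline: one word per line.
words=$(printf 'To be, or not to be.\n' | tr -cs A-Za-z '\012')

printf '%s\n' "$lower" "$unix" "$words"
```

The last form is a common first stage of a word-counting pipeline, since its output can be fed straight into other line-oriented filters.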
tr Examples (2)
• tr a-b-d abcd
• tr a-c xyz
• tr -ac xyz
• tr \\
• tr -c \ x
• tr \c x