File Processing


damon


Presentation Transcript


  1. File Processing

  2. Introduction • More UNIX commands for handling files • Regular Expressions and Searching files • Redirection and pipes • Bash facilities

  3. In UNIX everything is a file! • Directories and programs are all files! • Devices (keyboard, mouse, screen, memory, hard disks, etc) are files • Input and Output channels are read and written like files. • All of these things can be manipulated like files

  4. More UNIX commands

  5. Review of commands seen so far • who, date, finger, passwd • man • pwd, cd, mkdir, rmdir • cp, mv, rm, ls • cat, more, less, head, tail • lpr • chmod, umask

  6. Getting Information about Files • file gives the content type of each specified file (e.g., text, binary, directory, program): file /bin/* • wc counts the lines, words and characters in a file: -l for lines, -w for words, -c for characters (strictly, -c counts bytes) • with no file argument, wc reads from standard input (the keyboard by default)
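
As a minimal sketch of the wc flags above, the following counts a small sample file. The file name and contents are made up for illustration, and the file command is only called if it is installed, since its exact wording varies between systems.

```shell
# Create a small sample file (hypothetical contents) in a temporary directory.
tmpdir=$(mktemp -d)
printf 'one two three\nfour five\n' > "$tmpdir/sample.txt"

# Reading from standard input makes wc print only the count, without a file name.
# tr strips the padding that some wc implementations add.
lines=$(wc -l < "$tmpdir/sample.txt" | tr -d ' ')
words=$(wc -w < "$tmpdir/sample.txt" | tr -d ' ')
bytes=$(wc -c < "$tmpdir/sample.txt" | tr -d ' ')
echo "lines=$lines words=$words bytes=$bytes"

# file reports a content type for each argument (skipped if file is not installed).
if command -v file > /dev/null 2>&1; then
    file "$tmpdir" "$tmpdir/sample.txt"
fi

rm -r "$tmpdir"
```

Redirecting the file into wc with < is a convenience for scripts: when wc is given a file name instead, it prints the name after the count.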

  7. Finding Files • find searches a directory hierarchy • find <starting directory> -name <filename> -print $ find /usr/share/doc/ -name 'post*' -print /usr/share/doc/postgresql-7.4.13 /usr/share/doc/postgresql-7.4.13/html/postmaster-shutdown.html /usr/share/doc/postgresql-7.4.13/html/postmaster-start.html $ find . -name '*.txt' -print (current versions of find print matches by default, so -print can be omitted) • which finds commands that are on your PATH • which <command> shows which <command> would be executed if we typed <command>. $ which tail /usr/bin/tail
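
The find usage above can be tried safely on a throwaway directory tree. This is a sketch only: the directory and file names are invented for the example.

```shell
# Build a small directory tree (hypothetical names) to search.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/docs/old"
touch "$tmpdir/docs/notes.txt" "$tmpdir/docs/old/draft.txt" "$tmpdir/docs/readme.md"

# Quote the pattern so the shell passes '*.txt' to find instead of expanding it.
found=$(find "$tmpdir" -name '*.txt' -print)
echo "$found"

# Count the matches: find descends into subdirectories, so both .txt files appear.
txt_count=$(echo "$found" | wc -l | tr -d ' ')
echo "matches: $txt_count"

rm -r "$tmpdir"
```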

  8. Sorting and Comparing Files • sort • prints out the lines of a file sorted into alphabetical order • can sort on fields within lines • can sort numerical entries (-n) • flags to remove duplicates, reverse the sort, etc. • cmp • tests whether two files are identical and reports the position of the first character where they differ (prints nothing and returns exit status 0 if they are identical)
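
A short sketch of sort and cmp on made-up data follows. The -r (reverse) and -u (unique) flags and cmp -s (silent) are standard options that the slide only alludes to; the file names are hypothetical.

```shell
tmpdir=$(mktemp -d)
printf '10\n9\n100\n9\n' > "$tmpdir/nums.txt"

# Default sort compares lines as text, so "100" comes before "9".
alpha=$(sort "$tmpdir/nums.txt" | paste -sd ' ' -)
# -n compares the lines as numbers.
numeric=$(sort -n "$tmpdir/nums.txt" | paste -sd ' ' -)
# -r reverses the order and -u drops duplicate lines.
unique_rev=$(sort -n -r -u "$tmpdir/nums.txt" | paste -sd ' ' -)
echo "text: $alpha | numeric: $numeric | unique, reversed: $unique_rev"

# cmp exits with status 0 for identical files; -s suppresses its output.
cp "$tmpdir/nums.txt" "$tmpdir/copy.txt"
same_status=0
cmp -s "$tmpdir/nums.txt" "$tmpdir/copy.txt" || same_status=$?

# For differing files cmp reports the first difference and exits with status 1.
printf '10\n8\n' > "$tmpdir/other.txt"
diff_status=0
cmp "$tmpdir/nums.txt" "$tmpdir/other.txt" || diff_status=$?

rm -r "$tmpdir"
```

Because cmp signals its result through the exit status, it is convenient in shell conditionals such as if cmp -s a b; then ...; fi.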

  9. Sorting and Comparing Files (2) • comm • compares two sorted files and gives three-column output: lines only in the first, lines only in the second, and lines in both • diff -c • gives the differences with a few lines either side (3 by default) to show context
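
To make the three comm columns concrete, here is a sketch on two small sorted files with invented contents. The -1, -2 and -3 options, which suppress the corresponding column, are standard comm flags not mentioned on the slide.

```shell
tmpdir=$(mktemp -d)
# comm expects both inputs to be sorted already.
printf 'apple\nbanana\ncherry\n' > "$tmpdir/a.txt"
printf 'banana\ncherry\ndate\n' > "$tmpdir/b.txt"

# Full three-column output: only in a, only in b, in both.
comm "$tmpdir/a.txt" "$tmpdir/b.txt"

# Suppress columns to keep just one of the three sets.
only_a=$(comm -23 "$tmpdir/a.txt" "$tmpdir/b.txt")
only_b=$(comm -13 "$tmpdir/a.txt" "$tmpdir/b.txt")
both=$(comm -12 "$tmpdir/a.txt" "$tmpdir/b.txt" | paste -sd ' ' -)
echo "only in a: $only_a | only in b: $only_b | in both: $both"

# diff -c prints the changes with surrounding context lines.
# Its exit status is 1 when the files differ.
diff_status=0
diff -c "$tmpdir/a.txt" "$tmpdir/b.txt" || diff_status=$?

rm -r "$tmpdir"
```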

  10. Regular Expressions and Searching Inside Files

  11. Searching Inside Files • grep pattern <pathname…> searches the specified files for the specified pattern and prints out all lines that contain it, e.g.: grep "that" poem will print every line in poem containing the string that (including inside longer words such as thatch)

  12. Regular Expressions • grep "That" poem will only find the string "That" in poem if it has an upper case 'T' followed by lower case 'hat' • Regular expressions are a much more powerful notation for matching many different text fragments with a single expression • e.g. we might wish to find "That", "that", "tHaT", etc.

  13. Regular Expressions (2) • Search expressions can be very complex, and several characters have special meanings • to insist that That matches only at the start of the line use grep "^That" poem • to insist that it matches only at the end of the line use grep "That$" poem • a dot matches any single character, so grep "c.t" poem matches cat, cbt, cct, etc.

  14. Regular Expressions (3) • Square brackets allow alternatives: • grep "[Tt]hat$" poem • An asterisk allows zero or more repetitions of the preceding match • grep "^-*$" poem for lines that are empty or contain only -'s • grep "^--*$" poem for lines with only -'s and at least one - • grep "Bengal.*Sumatra" poem for lines with Bengal followed sometime later by Sumatra • Many flags to: • display only the number of matching lines (-c), ignore case (-i), precede each line by its line number in the file (-n), and so forth
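
The patterns from the last few slides can be checked against a small file. The sketch below uses a made-up six-line poem (the contents are an assumption, chosen so that each pattern matches something) and grep -c to count matching lines.

```shell
tmpdir=$(mktemp -d)
poem="$tmpdir/poem"
# Hypothetical poem: line 4 is all dashes and line 5 is empty.
cat > "$poem" <<'EOF'
That was the start
He said that
the cat sat
----

From Bengal to far Sumatra
EOF

exact=$(grep -c 'That' "$poem")              # case-sensitive: line 1 only
nocase=$(grep -c -i 'that' "$poem")          # -i ignores case: lines 1 and 2
at_start=$(grep -c '^That' "$poem")          # ^ anchors to the start of the line
at_end=$(grep -c '[Tt]hat$' "$poem")         # $ anchors to the end: line 2
dot=$(grep -c 'c.t' "$poem")                 # . matches any one character: "cat"
dash_or_empty=$(grep -c '^-*$' "$poem")      # zero or more dashes: lines 4 and 5
dash_only=$(grep -c '^--*$' "$poem")         # at least one dash: line 4
later=$(grep -c 'Bengal.*Sumatra' "$poem")   # .* spans the text in between
numbered=$(grep -n 'that' "$poem")           # -n prefixes the line number

echo "$exact $nocase $at_start $at_end $dot $dash_or_empty $dash_only $later"
echo "$numbered"
rm -r "$tmpdir"
```

Single quotes are used around the patterns so that the shell leaves characters such as $ and * for grep to interpret.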

  15. Redirection and Pipes: Connecting all the tools!

  16. Input and Output in UNIX • UNIX considers input and output to programs to be “streams of data” • Could be from/to the user • Could be from/to a file • Could be from/to another program

  17. Redirecting Input and Output • Input and output need not involve only the keyboard and screen; they can be redirected to and from files • Each UNIX command has at least one input channel and two output channels: • STDIN (0) Input channel • STDOUT (1) Output channel • STDERR (2) Output channel • Commands usually open further input and output channels to read and write the files named in their arguments

  18. STDIN • Stands for standard input • This is where programs expect to find their input • STDIN is set by default to read input from the keyboard • If you want to read the input from a file instead, use <

  19. STDOUT • Stands for standard output • This is where programs (usually) write any output they generate. By default STDOUT appears in your terminal window • If you want to save the output to a file instead, use > • The file will be created if it does not exist • If the file already exists then it will be overwritten • You can also use >> which appends the output onto a file's contents

  20. STDERR • STDERR stands for standard error • This is where programs usually write error messages, so even if you are redirecting the normal output to a file, you can still see error messages on the screen • You can redirect STDERR using 2>

  21. Redirecting Input and Output (2) • By default, UNIX attaches STDIN to the keyboard and STDOUT and STDERR to the screen [Diagram: a command with its stdin, stdout and stderr channels, which can also be connected to input files and output files]

  22. Redirecting Input and Output (3) • Use > to redirect STDOUT to a file ls mydir > temp temp will be created or overwritten to contain the normal output of the ls command, although error messages still go to the screen • Use 2>&1 after the > redirection to send both outputs to the same file ls mydir > temp 2>&1 • >> is like > except that it appends to an existing file instead of overwriting it ls anotherdir >> temp • < redirects the standard input
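
The redirection operators from the STDIN, STDOUT and STDERR slides can be seen side by side in the sketch below. The directory and file names are invented, and the "missing" directory is deliberately absent so that ls writes an error message.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/mydir"
touch "$tmpdir/mydir/a" "$tmpdir/mydir/b"

ls "$tmpdir/mydir" > "$tmpdir/temp"     # > creates or overwrites temp
ls "$tmpdir/mydir" >> "$tmpdir/temp"    # >> appends a second copy

# < feeds the file to wc on standard input.
out_lines=$(wc -l < "$tmpdir/temp" | tr -d ' ')

# Listing a directory that does not exist produces only an error message.
# 2> captures it separately, so the stdout file stays empty.
ls "$tmpdir/missing" > "$tmpdir/out" 2> "$tmpdir/err" || true
out_bytes=$(wc -c < "$tmpdir/out" | tr -d ' ')
if [ -s "$tmpdir/err" ]; then err_captured=yes; else err_captured=no; fi

# 2>&1 after the > sends stderr to the same file as stdout.
ls "$tmpdir/missing" > "$tmpdir/both" 2>&1 || true
if [ -s "$tmpdir/both" ]; then both_captured=yes; else both_captured=no; fi

echo "lines=$out_lines out_bytes=$out_bytes err=$err_captured both=$both_captured"
rm -r "$tmpdir"
```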

  23. Piping • Like redirection except that it attaches input and output to other commands instead of files • User can build a pipeline of connected commands each of which operates on the output of the one before • This is why so many commands take input from stdin when no files are given as arguments (e.g., cat, more, sort, grep, wc) • Uses the pipe symbol, ‘|’

  24. Redirecting Input and Output with Pipes • ls | more gives paged output from an ls [Diagram: ls takes stdin from the keyboard and its stdout is piped into more, which writes to the screen; stderr from both commands goes to the screen] • What about: • who | grep zlizmj • who | grep zlizmj | wc -l • who | sort • file /etc/* | grep "ascii" • Complex pipes can be saved permanently as shell scripts or aliases
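
Since the output of who depends on who is logged in, the sketch below feeds a pipeline with fixed text that stands in for it. The user names and terminals are invented, and cut is an extra standard tool not covered in the slides.

```shell
# Stand-in for the output of who (hypothetical users and terminals).
sessions='alice tty1
bob pts/0
alice pts/1'

# Like who | grep zlizmj | wc -l: count the sessions that belong to one user.
alice_count=$(printf '%s\n' "$sessions" | grep alice | wc -l | tr -d ' ')

# A longer pipeline: keep the first field, sort it, and drop duplicates.
users=$(printf '%s\n' "$sessions" | cut -d ' ' -f 1 | sort -u | paste -sd ' ' -)

echo "alice has $alice_count sessions; users: $users"
```

Each stage reads the previous stage's standard output on its own standard input, so no temporary files are needed.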

  25. Hidden Files (1) • Files whose names start with a dot do not show up in a straight ls command • Instead use the -a flag, (i.e., ls -a) • These are often special files for configuring the system or different applications .login .bash_logout .bashrc .profile .bash_profile

  26. Hidden Files (2) • You can permanently customise your environment by editing your .profile • Once you’ve edited it you can apply your changes immediately • Type source .profile (source reads and executes a file) OR • Log out and log in again. The commands in .profile are executed every time you log in
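
To see what sourcing does without touching a real .profile, the sketch below sources a throwaway file. The file name and the GREETING variable are made up for the example; the dot command is the portable spelling of source.

```shell
tmpdir=$(mktemp -d)
# A stand-in for a profile file (hypothetical name and contents).
printf 'GREETING=hello\n' > "$tmpdir/profile_fragment"

# "." reads the file and executes it in the current shell,
# so the variable it sets is still visible afterwards.
. "$tmpdir/profile_fragment"
echo "$GREETING"

rm -r "$tmpdir"
```

Running the file as a separate script would set the variable only in a child process, which is why changes to .profile need source or a fresh login.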

  27. UNIX Shell • The UNIX command line interface is called the 'shell' • There are many different shells, for example csh, bash, tcsh, and usually you will run only one type of shell in a login session • Different types of shell have different built-in commands, although the core commands are common

  28. Review of Lecture 1 • Editing the command line • DELETE or backspace to delete the last character (also ^H) • ^D to delete the next character • ^W to delete the last word • ^U to delete the line back to its beginning • ^C to interrupt most commands • ^A and ^E to go to the beginning or end of the line (^X means press the Control key and X at the same time)

  29. Bash facilities • There is a history of previously entered commands (called events) that you can see with the command history • You can recall and modify these with: • !! previous event again • !! string previous event with string added • !n event number n • !-n the nth previous event • !prefix last event that began with prefix • !* all the arguments of the last event • !^, !$ first and last arguments of the last event • !:5 fifth argument of the last event • many more: see the bash manual • The most useful of them all: ^R • ^R searches interactively in the history • Press Enter when you have found the one you want (or right arrow to edit it)

  30. Aliases (bash) • To define shorthand for complex commands • alias name=definition defines an alias alias hist=history alias ls='ls -F' • alias alone shows you the current aliases • unalias name removes an alias • unalias -a removes all aliases
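
The alias commands above can be exercised as follows. This sketch only defines, lists and removes aliases; it does not run them, because bash expands aliases in scripts only when shopt -s expand_aliases is set (interactive shells expand them by default).

```shell
# Start from a clean slate so the counts below are predictable.
unalias -a

alias hist=history
alias ls='ls -F'

# alias with no arguments lists the current aliases, one per line.
alias
defined=$(alias | wc -l | tr -d ' ')

# unalias name removes a single alias.
unalias hist
after_one=$(alias | wc -l | tr -d ' ')

# unalias -a removes every remaining alias.
unalias -a
after_all=$(alias | wc -l | tr -d ' ')

echo "defined=$defined after_one=$after_one after_all=$after_all"
```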

  31. The .bash_profile file • Whenever a login shell starts up it executes this file. It can be used to create aliases and set the history length automatically. Could contain: HISTSIZE=200 alias h=history alias print='lpr -Pmyprint'

  32. Directory stacks (bash) • Lets you remember old directories when you change to new ones • pushd puts a new directory on top of the stack • popd removes it and goes back to the previous one • dirs shows the stack

  33. Directory stacks (bash) (2) [Diagram: a directory tree containing /a/b and /x/y, showing how the directory stack changes through the sequence cd /x/y, pushd /a/b, popd]
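
The cd, pushd, popd sequence from the diagram can be replayed as a sketch inside a temporary directory, so that a/b and x/y here are stand-ins for /a/b and /x/y. This needs bash, since pushd, popd and dirs are bash built-ins; dirs -p, which prints one stack entry per line, is a standard option not shown on the slides.

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/a/b" "$tmpdir/x/y"

cd "$tmpdir/x/y"

# pushd changes to a/b and remembers x/y underneath it on the stack.
pushd "$tmpdir/a/b" > /dev/null
depth=$(dirs -p | wc -l | tr -d ' ')
after_pushd=$(basename "$PWD")

# popd removes the top entry and returns to the remembered directory.
popd > /dev/null
after_popd=$(basename "$PWD")

echo "stack depth after pushd: $depth; pushd went to $after_pushd; popd returned to $after_popd"

cd /
rm -r "$tmpdir"
```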

  34. More UNIX … • UNIXhelp for Users: http://unixhelp.ed.ac.uk/ • CERN Unix users guide: http://consult.cern.ch/writeup/unixguide/unix_2.html • Commonly Used Unix Commands: http://infohost.nmt.edu/tcc/help/unix/unix_cmd.html • Unix Fundamentals: http://infohost.nmt.edu/tcc/help/unix/fund.html • Unix 101: http://www.ugu.com/sui/ugu/show?I=help.articles.unix101 • Introduction to Unix Systems Administration: http://8help.osu.edu/wks/sysadm_course/html/sysadm-1.html

  35. Summary • More UNIX commands • Redirection and Pipes • Bash facilities
