
COSC 1306 COMPUTER SCIENCE AND PROGRAMMING

COSC 1306 COMPUTER SCIENCE AND PROGRAMMING. Jehan-François Pâris, jfparis@uh.edu, Fall 2016. THE ONLINE BOOK, CHAPTER XI: FILES. Chapter Overview: We will learn how to read, create and modify files. This is essential if we want to store our program inputs and results.





Presentation Transcript


  1. COSC 1306 COMPUTER SCIENCE AND PROGRAMMING. Jehan-François Pâris, jfparis@uh.edu, Fall 2016

  2. THE ONLINE BOOK, CHAPTER XI: FILES

  3. Chapter Overview: We will learn how to read, create and modify files. This is essential if we want to store our program inputs and results. Pay special attention to pickled files: they are very easy to use!

  4. Accessing file contents: Two-step process: first we open the file, then we access its contents (read or write). When we are done, we close the file.
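
The open, access, close sequence can be sketched as follows. This is a minimal illustration, not code from the slides: the scratch file name demo.txt and the use of tempfile are assumptions, chosen so the example runs without any pre-existing file.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder (an assumption of this sketch).
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

fh = open(path, "w")      # open the file, requesting write permission
fh.write("hello\n")       # access its contents: write
fh.close()                # done: close the file

fh = open(path, "r")      # open it again, this time for reading
contents = fh.read()      # access its contents: read
fh.close()                # done: close the file

print(contents)
```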

  5. What happens at open() time? The system verifies that you are an authorized user and that you have the right permission (read permission or write permission; execute permission exists but doesn't apply), and returns a file handle / file descriptor.

  6. The file handle: Gives the user fast direct access to the file (no folder lookups) and the authority to execute the file operations whose permissions have been requested.

  7. Python open(): open(name, mode = 'r', buffering = -1), where name is the name of the file, mode is the permission requested (default is 'r' for read only), and buffering specifies the buffer size (code -1 means use the system default value).

  8. The modes: Can request 'r' for read-only; 'w' for write-only (always overwrites the file); 'a' for append (writes at the end); 'r+' or 'a+' for updating (read + write/append).
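
A small sketch of how 'w' and 'a' differ in practice. The file name modes.txt and the helper read_all() are hypothetical names introduced only for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file (an assumption of this sketch).
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

def read_all(p):
    """Hypothetical helper: return the whole contents of file p."""
    fh = open(p, "r")
    text = fh.read()
    fh.close()
    return text

fh = open(path, "w")          # 'w' creates the file if it does not exist
fh.write("first\n")
fh.close()

fh = open(path, "w")          # 'w' again: the old contents are discarded
fh.write("second\n")
fh.close()
after_write = read_all(path)  # only "second\n" is left

fh = open(path, "a")          # 'a' keeps the contents and writes at the end
fh.write("third\n")
fh.close()
after_append = read_all(path)

print(repr(after_write))
print(repr(after_append))
```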

  9. Examples
      f1 = open("myfile.txt")   # same as f1 = open("myfile.txt", "r")
      f2 = open("test\\sample.txt", "r")
      f3 = open("test/sample.txt", "r")
      f4 = open("C:\\Users\\Jehan-Francois Paris\\Documents\\Courses\\1306\\Python\\myfile.txt")

  10. The file system: Provides long-term storage of information. Will store data in stable storage (disk). Cannot be RAM because dynamic RAM loses its contents when powered off, static RAM is too expensive, and system crashes can corrupt the contents of main memory.

  11. Overall organization: Data managed by the file system are grouped in user-defined data sets called files. The file system must provide a mechanism for naming these data. Each file system has its own set of conventions. All modern operating systems use a hierarchical directory structure.

  12. Windows solution: Each device and each disk partition is identified by a letter. A: and B: were used by the floppy drives. C: is the first disk partition of the hard drive. If the hard drive has no other disk partition, D: denotes the DVD drive. Each device and each disk partition has its own hierarchy of folders.

  13. Windows solution [Figure: separate folder trees for C: (containing Windows, Users and Program Files), a second disk D:, and a flash drive F:]

  14. Linux organization: Inherited from Unix. Each device and disk partition has its own directory tree. Disk partitions are glued together through the mount operation to form a single tree. The typical user does not know where her files are stored. Uses "/" as a separator.

  15. UNIX/LINUX organization [Figure: the root partition / holds the directories usr and bin; "the magic mount" attaches the other partition at usr] The second partition can be accessed as /usr.

  16. Mac OS organization: Similar to Windows. Disk partitions are not merged; they are represented by separate icons on the desktop.

  17. Accessing a file (I): Your Python programs are stored in a folder (AKA directory). On my home PC it is C:\Users\Jehan-Francois Paris\Documents\Courses\1306\Python. All files in that folder can be directly accessed through their names, e.g. "myfile.txt".

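  18. [Figure: folder tree from the root through Users, J.-F. Paris, Documents, Courses, 1306 and Python down to x.txt. The file's path is Courses\1306\Python\x.txt from Documents, 1306\Python\x.txt from Courses, Python\x.txt from 1306, and x.txt from Python.]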

  19. Accessing a file (II): Files in folders inside that folder (subfolders) can be accessed by specifying the subfolder first. Windows style: "test\\sample.txt" (note the double backslash). Linux/Unix/Mac OS X style: "test/sample.txt" (generally works for Windows too).

  20. Why the double backslash? The backslash is an escape character in Python. It combines with its successor to represent non-printable characters: '\n' represents a newline and '\t' represents a tab. Must use '\\' to represent a plain backslash.
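
A quick way to check this in the interpreter: each escape sequence is two keystrokes in the source but a single character in the string. The variable names below are made up for this sketch.

```python
newline = '\n'        # one character: a newline
tab = '\t'            # one character: a tab
backslash = '\\'      # one character: a plain backslash

# The Windows-style path from the slides contains exactly one backslash.
windows_path = "test\\sample.txt"

print(len(newline), len(tab), len(backslash))   # 1 1 1
print(windows_path)                             # test\sample.txt
```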

  21. Accessing a file (III): For other files, must use the full pathname. Windows style: "C:\\Users\\Jehan-Francois Paris\\Documents\\Courses\\1306\\Python\\myfile.txt". Linux and Mac: "/Users/Jehan-Francois Paris/Documents/Courses/1306/Python/myfile.txt".

  22. Reading a file: Four ways: line by line; global reads; within a while loop (also works with other languages); pickled files.

  23. Line-by-line reads
      for line in fh : # special for loop
          # anything you want
      fh.close() # optional

  24. Example
      f3 = open("test/sample.txt", "r")
      for line in f3 :
          print(line)
      f3.close() # optional

  25. Output (with one or more extra blank lines)
      To be or not to be that is the question

      Now is the winter of our discontent

  26. Why? Each line read from the file ends with a newline, and print(…) adds an extra newline.
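
One way to see the trailing newlines is to print repr(line) instead of line. The sketch below writes its own two-line scratch file; the file name and its contents are assumptions made for the illustration, not the course's sample.txt.

```python
import os
import tempfile

# Hypothetical scratch file whose last line has no trailing newline.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
fh = open(path, "w")
fh.write("first line\nsecond line")
fh.close()

seen = []                      # keep the lines exactly as they were read
fh = open(path, "r")
for line in fh:
    seen.append(line)
    print(repr(line))          # repr() makes the '\n' visible
fh.close()
```

Only the first line carries a '\n' here, which is the situation the following slides run into.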

  27. Trying to remove blank lines
      print('-----')
      f5 = open("test/sample.txt", "r")
      for line in f5 :
          print(line[:-1]) # remove last char
      f5.close() # optional
      print('------')

  28. The output
      ------
      To be or not to be that is the question
      Now is the winter of our disconten
      ------
      The last line did not end with a newline, so its final 't' was removed instead!

  29. A smarter solution (I): Only remove the last character if it is a newline
      if line[-1] == '\n' :
          print(line[:-1])
      else :
          print(line)

  30. A smarter solution (II)
      print('-------')
      fh = open("test/sample.txt", "r")
      for line in fh :
          if line[-1] == '\n' :
              print(line[:-1]) # remove last char
          else :
              print(line)
      print('------')
      fh.close() # optional

  31. It works!
      ------
      To be or not to be that is the question
      Now is the winter of our discontent
      -------

  32. We can do better
      • Use the rstrip() Python method
      • astring.rstrip() returns a copy of astring with all trailing whitespace (spaces, tabs, newlines) removed
      • astring.rstrip('\n') returns a copy of astring with all trailing newlines removed

  33. Examples
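
A minimal sketch of rstrip() on made-up strings (these are not the slide's own examples). Note that rstrip() returns a new string and leaves the original unchanged.

```python
padded = "hello  \n\n"                    # two spaces, then two newlines

no_whitespace = padded.rstrip()           # strips spaces and newlines alike
no_newlines = padded.rstrip('\n')         # strips only the newlines
untouched = "hello".rstrip('\n')          # nothing to strip

print(repr(no_whitespace))   # 'hello'
print(repr(no_newlines))     # 'hello  '
print(repr(untouched))       # 'hello'
print(repr(padded))          # the original string is unchanged
```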

  34. The simplest solution: This will remove all trailing newlines, even the ones we should keep
      print('-------')
      fh = open("test/sample.txt", "r")
      for line in fh :
          print(line.rstrip('\n'))
      print('------')
      fh.close() # optional

  35. Global reads: fh.read() returns the whole contents of the file specified by file handle fh. File contents are stored in a single string that might be very large.

  36. Example
      f2 = open("test\\sample.txt", "r")
      bigstring = f2.read()
      print(bigstring)
      f2.close() # optional

  37. Output of example
      To be or not to be that is the question
      Now is the winter of our discontent
      Exact contents of file 'test\sample.txt' followed by an extra return

  38. fh.read() and fh.read(n): fh.read() reads in the whole fh file and returns its contents as a single string. fh.read(n) reads at most the next n characters of a text file fh (n bytes if the file was opened in binary mode).
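
A short sketch of successive fh.read(n) calls, each picking up where the previous one stopped. The scratch file chunks.txt and its ten-letter contents are assumptions made for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file holding ten characters.
path = os.path.join(tempfile.mkdtemp(), "chunks.txt")
fh = open(path, "w")
fh.write("abcdefghij")
fh.close()

fh = open(path, "r")
first = fh.read(4)     # the first four characters
second = fh.read(4)    # the next four
rest = fh.read()       # everything that is left
at_end = fh.read(4)    # empty string: nothing more to read
fh.close()

print(first, second, rest, repr(at_end))
```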

  39. Reading within a loop: Standard method for C/C++
      infile = open("test\\sample.txt", "r")
      line = infile.readline() # priming read
      while line : # false if empty
          print(line.rstrip("\n"))
          line = infile.readline()
      infile.close()

  40. Making sense of file contents: Most files contain more than one data item per line
      COSC 713-743-3350
      UHPD 713-743-3333
      Must split lines: mystring.split(sepchar), where sepchar is a separation character, returns a list of items

  41. Splitting strings
      >>> txt = "Four score and seven years ago"
      >>> txt.split()
      ['Four', 'score', 'and', 'seven', 'years', 'ago']
      >>> record = "1,'Baker, Andy', 83, 89, 85"
      >>> record.split(',')
      ['1', "'Baker", " Andy'", ' 83', ' 89', ' 85']
      Not what we wanted!

  42. Example
      # how2split.py
      print('-----')
      fh = open("test/sample.txt", "r")
      for line in fh :
          words = line.split()
          for xxx in words :
              print(xxx)
      fh.close() # optional
      print('-----')

  43. Output: Spurious newlines are gone
      -----
      To
      be
      …
      of
      our
      discontent
      -----

  44. Standard way to access a file
      # preprocessing
      # set up counters, strings and lists
      fh = open("input.txt", "r")
      for line in fh :
          words = line.split(sepchar) # often space
          for xxx in words :
              # do something
      fh.close() # optional
      # postprocessing
      # print results
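
As one concrete instance of this pattern, the sketch below counts the words in a file. The input file and its two lines of text are assumptions created by the example itself so that it runs on its own.

```python
import os
import tempfile

# Hypothetical input file (an assumption of this sketch).
path = os.path.join(tempfile.mkdtemp(), "input.txt")
fh = open(path, "w")
fh.write("To be or not to be\nthat is the question\n")
fh.close()

# preprocessing: set up the counter
count = 0

fh = open(path, "r")
for line in fh:
    words = line.split()      # split on whitespace, dropping the newline
    for word in words:
        count += 1            # "do something" with each word
fh.close()

# postprocessing: print the result
print("The file contains", count, "words")
```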

  45. Example
      • List of expenditures with dates:
      Rent 11/2/16 $850
      Latte 11/2/16 $4.50
      Food 11/2/16 $35.47
      Latte 11/3/16 $4.50
      Latte 11/3/16 $4.50
      Outing 11/4/16 $27.00
      • Want to know how much money was spent on latte

  46. First attempt
      • Read line by line
      • Will split all lines such as "Food 11/2/16 $35.47" into ["Food", "11/2/16", "$35.47"]
      • Will use first and last entries of each line list

  47. First attempt
      total = 0 # set up accumulator
      fh = open("expenses.txt", "r")
      for line in fh :
          words = line.split(" ")
          if words[0] == 'Latte' :
              total += words[2] # increment
      fh.close() # optional
      print("you spent %.2f on latte" % total)
      It does not work: words[2] is a string such as "$4.50", which cannot be added to a number.

  48. Second attempt: Must first remove the offending '$'. Must also convert the string to float
      def price2float(s) :
          """ remove leading dollar sign"""
          if s[0] == "$" :
              return float(s[1:])
          else :
              return float(s)

  49. Second attempt
      total = 0 # set up accumulator
      fh = open("expenses.txt", "r")
      for line in fh :
          words = line.split(" ")
          if words[0] == 'Latte' :
              total += price2float(words[2])
      fh.close() # optional
      print("You spent $%.2f on latte" % total)
      Output: You spent $13.50 on latte

  50. Picking the right separator (I): Commas (CSV, the Excel format). Values are separated by commas. Strings are stored without quotes unless they contain a comma, e.g. "Doe, Jane", freshman, 90, 90. Quotes within strings are doubled.
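
A plain split(',') breaks a quoted field such as "Doe, Jane" in two. As a hedged sketch, Python's standard csv module (not mentioned on the slide) follows the quoting rules listed above. The second record, with its doubled quotes, is made up for the illustration, and skipinitialspace=True is used only because the sample lines put a space after each comma.

```python
import csv

# Two CSV lines: the first is the slide's example, the second is hypothetical.
lines = [
    '"Doe, Jane", freshman, 90, 90',
    '"Tom ""T-Bone"" Smith", senior, 85, 88',
]

# Naive approach: the comma inside the quotes is treated as a separator.
naive = lines[0].split(',')

# csv.reader accepts any iterable of strings and honors quotes.
rows = list(csv.reader(lines, skipinitialspace=True))

print(naive)
for row in rows:
    print(row)
```

All fields come back as strings, so the grades still need int() or float() before any arithmetic.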
