Advanced UNIX. 240-491 Special Topics in Comp. Eng. 1 Semester 2, 2000-2001. Objectives of these slides: a more detailed look at file processing in C. 10. File Processing. Overview. 1. Background 2. Text Files 3. Error Handling 4. Binary Files 5. Direct Access. continued.
 
                
Advanced UNIX 240-491 Special Topics in Comp. Eng. 1, Semester 2, 2000-2001 • Objectives of these slides: • a more detailed look at file processing in C 10. File Processing
Overview 1. Background 2. Text Files 3. Error Handling 4. Binary Files 5. Direct Access continued
6. Temporary Files 7. Renaming & Removing 8. Character Pushback 9. Buffering 10. Redirecting I/O
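1. Background
• Two types of file: text and binary
• Two access methods: sequential and direct (also called random access)
• UNIX standard I/O to a terminal is line buffered
  • input is processed a line at a time
  • output may not be written immediately; it is held in a buffer until a newline is output
  • streams connected to files are usually block buffered, so output is written when the buffer fills or the stream is flushed or closed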
2. Text Files
Standard I/O    File I/O
printf()        fprintf()
scanf()         fscanf()
gets()          fgets()
puts()          fputs()
getchar()       getc()
putchar()       putc()
• Most file functions just add an 'f' to the standard I/O name.
Function Prototypes
• int fscanf(FILE *fp, char *format, ...);
• int fprintf(FILE *fp, char *format, ...);
• char *fgets(char *str, int max, FILE *fp);
• int fputs(char *str, FILE *fp);
• int getc(FILE *fp);
• int putc(int ch, FILE *fp);
• The new argument is the file pointer fp.
2.1. Standard FILE* Constants
Name      Meaning
stdin     standard input
stdout    standard output
stderr    standard error
• e.g.
    if (len >= MAX_LEN)
        fprintf(stderr, "String is too long\n");
2.2. Opening / Closing
• FILE *fopen(char *filename, char *mode);
• int fclose(FILE *fp);
• fopen() modes:
Mode    Meaning
"r"     read mode
"w"     write mode
"a"     append mode
Careful Opening
    FILE *fp;                      /* file pointer */
    char *fname = "myfile.dat";

    if ((fp = fopen(fname, "r")) == NULL) {
        fprintf(stderr, "Error opening %s\n", fname);
        exit(1);
    }
    ...                            /* file opened okay */
2.3. Text I/O • As with standard I/O: • formatted I/O (fprintf, fscanf) • line I/O (fgets, fputs) • character I/O (getc, putc)
2.3.1. Formatted I/O
• int fscanf(FILE *fp, char *format, ...);
• int fprintf(FILE *fp, char *format, ...);
• fscanf() returns the number of input items successfully assigned, or EOF if end-of-file or a read error occurs before the first conversion.
• fprintf() returns the number of characters output, or a negative value if an error occurs.
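The return value of fscanf() is what normally drives a read loop. The following is a minimal sketch, assuming the input is a stream of whitespace-separated integers; the helper name sum_ints() is hypothetical and not part of the slides.

```c
#include <stdio.h>

/* Hypothetical helper: sum the whitespace-separated integers on fp.
 * Stores the total in *sum and returns how many integers were read,
 * or -1 if a read error occurred. */
int sum_ints(FILE *fp, long *sum)
{
    int value;
    int count = 0;

    *sum = 0;
    /* fscanf() returns the number of items assigned: 1 on success here,
     * 0 if the next input is not a number, EOF at end-of-file or on error. */
    while (fscanf(fp, "%d", &value) == 1) {
        *sum += value;
        count++;
    }
    if (ferror(fp))
        return -1;
    return count;
}
```

Comparing against the expected item count (1), rather than against EOF, also stops the loop cleanly when the file contains something that is not a number.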
2.3.2. Line I/O • char *fgets(char *str, int max, FILE *fp); • int fputs(char *str, FILE *fp); • If an error or EOF occurs, fgets() returns NULL, fputs() returns EOF. • If okay, fgets() returns pointer to string, fputs() returns non-negative integer.
Differences between fgets() and gets()
• Use of max argument: fgets() reads in at most max-1 chars (so there is room for '\0'); gets() has no limit and can overflow the buffer.
• fgets() retains the input '\n'
• Deleting the '\n':
    len = strlen(line);
    if (len > 0 && line[len-1] == '\n')   /* to be safe */
        line[len-1] = '\0';
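A common alternative for deleting the newline uses strcspn(), which avoids the length arithmetic altogether. This is only a sketch; the helper name chomp() is an assumption borrowed from other languages, not a C library function.

```c
#include <string.h>

/* Hypothetical helper: remove a trailing newline left by fgets(), if any. */
void chomp(char *line)
{
    /* strcspn() returns the index of the first '\n', or the string length
     * if there is none, so this is safe even for an empty string. */
    line[strcspn(line, "\n")] = '\0';
}
```

If the line was longer than the buffer, fgets() stores no '\n' and the call leaves the string unchanged.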
Difference between fputs() and puts() • fputs() does not add a '\n' to the output; puts() does.
Line-by-line Echo
    #define MAX 100    /* max line length */
      :
    void output_file(char *fname)
    {
        FILE *fp;
        char line[MAX];

        if ((fp = fopen(fname, "r")) == NULL) {
            fprintf(stderr, "Error opening %s\n", fname);
            exit(1);
        }
        while (fgets(line, MAX, fp) != NULL)
            fputs(line, stdout);
        fclose(fp);
    }
2.3.3. Character I/O
• int getc(FILE *fp);
• int putc(int ch, FILE *fp);
• getc() returns EOF if an error or end-of-file occurs.
• putc() returns the character written, or EOF if an error occurs.
• Can also use fgetc() and fputc().
Char-by-char Echo
    #define MAX 100    /* max line length */
      :
    void output_file(char *fname)
    {
        FILE *fp;
        int ch;

        if ((fp = fopen(fname, "r")) == NULL) {
            fprintf(stderr, "Error opening %s\n", fname);
            exit(1);
        }
        while ((ch = getc(fp)) != EOF)
            putc(ch, stdout);
        fclose(fp);
    }
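Using feof()
• It is tempting to rewrite the previous while-loop as:
    while (!feof(fp)) {
        ch = getc(fp);
        putc(ch, stdout);
    }
• This is not a common coding style, and as written it is wrong: feof() only becomes true after a read has already hit end-of-file, so the last getc() returns EOF and the loop writes that value as an extra character.
• Use feof() and ferror() after a read has failed, to find out why it failed.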
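The usual pattern tests the return value of getc() in the loop condition and only afterwards asks why the loop ended. A minimal sketch, with a hypothetical helper name copy_chars():

```c
#include <stdio.h>

/* Hypothetical helper: copy in to out one character at a time.
 * Returns the number of characters copied, or -1 on a read or write error. */
long copy_chars(FILE *in, FILE *out)
{
    int ch;              /* int, not char, so EOF can be distinguished */
    long count = 0;

    while ((ch = getc(in)) != EOF) {
        if (putc(ch, out) == EOF)
            return -1;
        count++;
    }
    /* getc() returned EOF: was that end-of-file or a read error? */
    if (ferror(in))
        return -1;
    return count;
}
```

Calling copy_chars(fp, stdout) gives the same echo behaviour as output_file(), but reports failure to the caller instead of ignoring it.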
3. Error Handling • int ferror(FILE *fp); • check error status of file stream • it returns non-zero if there is an error • void clearerr(FILE *fp); • reset error status continued
• void perror(char *str);
  • print str (usually a filename) followed by a colon and a system-defined error message
  • common in advanced coding
• e.g.
    ...
    fp = fopen(fname, "r");
    if (fp == NULL) {
        perror(fname);
        exit(1);
    }
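errno
• The system error message is based on a system error number (errno) which is set when a library function returns an error.
• Only examine errno after a call has reported failure; a successful call may leave an old value in it.
    #include <errno.h>
    ...
    fp = fopen(fname, "r");
    if (fp == NULL && errno == ...)
        ...
continued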
Many errno integer constants are defined in errno.h
• it is better style to use the constant name instead of the number
• Linux distributions usually put most errno constants in asm/errno.h
• Example errno constants:
Name      Meaning
EPERM     operation not permitted
EACCES    permission denied
ENOENT    no such file / directory
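As a sketch of how the constants are used, the function below saves errno immediately after a failed fopen() and turns it into a message with strerror(). The helper name open_for_reading() is hypothetical, and the code assumes a POSIX system, where fopen() sets errno on failure (ISO C alone does not promise this).

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to open fname for reading.
 * On success, stores the stream in *fpp and returns 0.
 * On failure, reports the reason on stderr and returns the errno value. */
int open_for_reading(const char *fname, FILE **fpp)
{
    errno = 0;
    *fpp = fopen(fname, "r");
    if (*fpp == NULL) {
        int saved = errno;   /* save it: later calls may change errno */
        fprintf(stderr, "%s: %s\n", fname, strerror(saved));
        return saved;
    }
    return 0;
}
```

A caller can then compare the result with a named constant, for example treating ENOENT as "create the file" and anything else as fatal.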
4. Binary Files • For storing non-character data • arrays, structs, integers (as bytes), GIFs, compressed data • Not portable across different systems • unless you have cross-platform reading/writing utilities, such as gzip • For portability, use text files
fopen() modes for Binary Files
Mode    Meaning
"rb"    read binary file
"wb"    write binary file
"ab"    append to binary file
• Add a "b" to the text file modes.
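Reading / Writing
• size_t fread(void *buffer, size_t size, size_t num, FILE *fp);
• size_t fwrite(const void *buffer, size_t size, size_t num, FILE *fp);
• Both return the number of things read/written. A value smaller than num means end-of-file or an error; neither function returns EOF.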
Example
• The code will write to a binary file containing employee records with the following type structure:
    #define MAX_NAME_LEN 50

    struct employee {
        int salary;
        char name[MAX_NAME_LEN + 1];
    };
continued
    struct employee e1, emps[MAX];
      :
      :
    /* write the struct to fp */
    fwrite(&e1, sizeof(struct employee), 1, fp);

    /* write all of the array with 1 op */
    fwrite(emps, sizeof(struct employee), MAX, fp);
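The calls above ignore the return values. A hedged sketch of a checked write and the matching read follows; the helper names save_emps() and load_emps() are hypothetical, and the struct is repeated only to keep the example self-contained.

```c
#include <stdio.h>

#define MAX_NAME_LEN 50

struct employee {
    int salary;
    char name[MAX_NAME_LEN + 1];
};

/* Hypothetical helper: write n records.
 * Returns 0 on success, -1 if fewer than n records were written. */
int save_emps(FILE *fp, const struct employee *emps, size_t n)
{
    return fwrite(emps, sizeof(struct employee), n, fp) == n ? 0 : -1;
}

/* Hypothetical helper: read up to max records.
 * Returns the number of complete records actually read. */
size_t load_emps(FILE *fp, struct employee *emps, size_t max)
{
    return fread(emps, sizeof(struct employee), max, fp);
}
```

Because the records are written as raw bytes, the file can only be read back reliably by a program built with the same struct layout, which is the portability problem mentioned earlier.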
5. Direct Access
• Direct access: move to any record in the binary file and then read (you do not have to read the others before it).
• e.g. a move to the 5th employee record would mean a move of size: 4 * sizeof(struct employee)
fopen() Modes for Direct Access (+)
Mode     Meaning
"rb+"    open binary file for read/write
"wb+"    create/clear binary file for read/write
"ab+"    open/create binary file for read/write at the end
Employees Example
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define DF "employees.dat"
    #define MAX_NAME_LEN 50

    struct employee {
        int salary;
        char name[MAX_NAME_LEN + 1];
    };

    int num_emps = 0;    /* num of employees in DF */
    FILE *fp;
      :
• Poor style: global variables
Data Format
• Layout of employees.dat:
    number | e1 | e2 | e3 | e4 | empty space of the right size . . .
• The basic coding technique is to store the number of employees currently in the file (e.g. 4) at the start of the file
  • some functions will need this number in order to know where the end of the data occurs
Open the Data File
    void open_file(void)
    {
        if ((fp = fopen(DF, "rb+")) == NULL) {
            fp = fopen(DF, "wb+");    /* create file */
            if (fp == NULL) {         /* could not create it either */
                perror(DF);
                exit(1);
            }
            num_emps = 0;             /* initial num. */
        }
        else if (fread(&num_emps, sizeof(num_emps), 1, fp) != 1)
            num_emps = 0;             /* opened file, but no num. stored in it */
    }
Move with fseek() • int fseek(FILE *fp, long offset, int origin); • Movement is specified with a starting position and offset from there. • The current position in the file is indicated with the file position pointer (not the same as fp).
Origin and Offset
• fseek() origin values:
Name        Value    Meaning
SEEK_SET    0        beginning of file
SEEK_CUR    1        current position
SEEK_END    2        end of file
• Offset is a long integer
  • can be negative (i.e. move backwards)
  • equals the number of bytes to move
Employees Continued
    void put_rec(int posn, struct employee *ep)
    /* write an employee at position posn */
    {
        long loc;

        loc = sizeof(num_emps) + ((posn-1)*sizeof(struct employee));
        fseek(fp, loc, SEEK_SET);
        fwrite(ep, sizeof(struct employee), 1, fp);
    }
• Can write anywhere
• No checking to avoid over-writing.
Read in an Employee
    void get_rec(int posn, struct employee *ep)
    /* read in employee at position posn */
    {
        long loc;

        loc = sizeof(num_emps) + ((posn-1)*sizeof(struct employee));
        fseek(fp, loc, SEEK_SET);
        fread(ep, sizeof(struct employee), 1, fp);
    }
• Should really check if ep contains something (i.e. that the read succeeded).
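One way to add the missing checks is sketched below. It is not the slides' version: the name get_rec_checked() is hypothetical, and the stream and record count are passed as arguments instead of being read from the globals, so that the function can be shown on its own.

```c
#include <stdio.h>

#define MAX_NAME_LEN 50

struct employee {
    int salary;
    char name[MAX_NAME_LEN + 1];
};

/* Hypothetical checked variant of get_rec().
 * Returns 0 on success, -1 if posn is out of range or the I/O fails. */
int get_rec_checked(FILE *fp, int num_emps, int posn, struct employee *ep)
{
    long loc;

    if (posn < 1 || posn > num_emps)
        return -1;                     /* no such record */

    /* skip the stored count, then the posn-1 records before this one */
    loc = (long)sizeof(num_emps) + (posn - 1) * (long)sizeof(struct employee);
    if (fseek(fp, loc, SEEK_SET) != 0)
        return -1;
    if (fread(ep, sizeof(struct employee), 1, fp) != 1)
        return -1;                     /* short read or error */
    return 0;
}
```

The same range test and return-value checks could be applied to put_rec() to stop it writing beyond the next free slot.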
Close Employees File
    void close_file(void)
    {
        rewind(fp);    /* same as fseek(fp, 0L, SEEK_SET); */

        /* update num. of employees */
        fwrite(&num_emps, sizeof(num_emps), 1, fp);
        fclose(fp);
    }
ftell() • Return current position of the file position pointer (i.e. its offset in bytes from the start of the file): long ftell(FILE *fp);
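A typical use of ftell() together with fseek() is finding the size of an open file. The sketch below assumes a UNIX system, where seeking to SEEK_END on a binary stream is reliable; the helper name file_size() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: return the size of an open binary file in bytes,
 * or -1 on error. The original file position is restored before returning. */
long file_size(FILE *fp)
{
    long saved, size;

    if ((saved = ftell(fp)) == -1L)       /* remember where we were */
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)     /* jump to the end */
        return -1;
    size = ftell(fp);                     /* offset of the end = size */
    if (fseek(fp, saved, SEEK_SET) != 0)  /* go back */
        return -1;
    return size;
}
```

In the employees example, (file_size(fp) - sizeof(num_emps)) / sizeof(struct employee) would give the number of record slots physically present in the file.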
6. Temporary Files
• FILE *tmpfile(void);           /* create a temp file */
• char *tmpnam(char *name);      /* create a unique name */
• tmpfile() opens the file with "wb+" mode; it is removed when the program exits
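A temporary file is handy as scratch space that is written once and then read back. The following is a minimal sketch; the helper name first_line_via_tmp() is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: write text to a temporary file, rewind it, and
 * read the first line back into buf (at most size-1 characters).
 * Returns 0 on success, -1 on failure. */
int first_line_via_tmp(const char *text, char *buf, int size)
{
    FILE *tmp = tmpfile();
    int ok;

    if (tmp == NULL)
        return -1;
    ok = fputs(text, tmp) != EOF;
    rewind(tmp);                 /* reposition between writing and reading */
    ok = ok && fgets(buf, size, tmp) != NULL;
    fclose(tmp);                 /* closing the stream also removes the file */
    return ok ? 0 : -1;
}
```

No filename is ever visible to the program, which avoids the clean-up code that a file created from a tmpnam() name would need.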
7. Renaming & Removing • int rename(char *old_name, char *new_name); • like mv in UNIX • int remove(char *filename); • like rm in UNIX
8. Character Pushback
• int ungetc(int ch, FILE *fp);
• Overcomes some problems with reading too much
  • 1 character lookahead can be coded
• Only one character of pushback is guaranteed between getc() calls
• Cannot push back EOF
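One-character lookahead is typically needed when reading a token whose end is only known after reading one character too many. A sketch under that assumption, with a hypothetical helper read_number() that does no overflow checking:

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read an unsigned decimal number from fp.
 * Stops at the first non-digit and pushes it back so the caller can
 * read it next. Returns -1 if no digit was found. */
long read_number(FILE *fp)
{
    int ch;
    long value = 0;
    int digits = 0;

    while ((ch = getc(fp)) != EOF && isdigit(ch)) {
        value = value * 10 + (ch - '0');
        digits++;
    }
    if (ch != EOF)
        ungetc(ch, fp);      /* one character of lookahead; EOF is never pushed back */
    return digits > 0 ? value : -1;
}
```

With the input "123abc", the call returns 123 and the next getc() on the same stream returns 'a'.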
9. Buffering • int fflush(FILE *fp); • e.g. fflush(stdout); • Flush partial lines • overcomes output line buffering • stderr is not buffered.
setbuf() • void setbuf(FILE *fp, char *buffer); • Most common use is to switch off buffering: setbuf(stdout, NULL); • equivalent to fflush() after every output function call
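setbuf() returns nothing, so a failed request goes unnoticed. The standard function setvbuf() does the same job and reports failure; the sketch below wraps it in a hypothetical helper, make_unbuffered().

```c
#include <stdio.h>

/* Hypothetical helper: switch off buffering for fp.
 * Must be called before any other operation on the stream.
 * Returns 0 on success, non-zero on failure. */
int make_unbuffered(FILE *fp)
{
    /* _IONBF = no buffering; the buffer pointer and size are then ignored */
    return setvbuf(fp, NULL, _IONBF, 0);
}
```

Calling make_unbuffered(stdout) at the start of main() has the same effect as setbuf(stdout, NULL).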
10. Redirecting I/O • FILE *freopen(char *filename, char *mode, FILE *fp); • opens the file with the mode and associates the stream with it • Most common use is to redirect stdin, stdout, stderr to mean the file • It is better style (usually) to use I/O redirection at the UNIX level. continued
    FILE *in;
    int n;

    in = freopen("infile", "r", stdin);
    if (in == NULL) {
        perror("infile");
        exit(1);
    }
    scanf("%d", &n);    /* read from infile */
      :
    fclose(in);