
Introduction to IDL Dealing with Data

Introduction to IDL Dealing with Data. Week 2 Rachel McCrary 3-31-2009. Goals for today!. Part 1 How to open a file for reading and writing How to read/write ASCII or formatted data How to read/write unformatted or binary data How to read NetCDF data files Part 2





Presentation Transcript


  1. Introduction to IDL Dealing with Data • Week 2 • Rachel McCrary • 3-31-2009

  2. Goals for today! • Part 1 • How to open a file for reading and writing • How to read/write ASCII or formatted data • How to read/write unformatted or binary data • How to read NetCDF data files • Part 2 • How to plot to the display window • How to use the plot command and modify it • How to plot multiple data sets, multiple axes, and multiple plots on the same page • How to add color to your plots

  3. Opening Files • All input/output in IDL is done with logical unit numbers. • A logical unit number is a pipe or conduit that is connected between IDL and the data file that you want to read from or write to. • There are three types of open commands • openr - Open a file for Reading • openw - Open a file for Writing • openu - Open a file for Updating

  4. Opening Files • The syntax of these three commands is the same: name of the command, logical unit number (lun), file name • openw, 20, 'filename.txt' • You can use logical unit numbers directly (and manage them yourself), as in the above example. • You can also have IDL manage logical unit numbers by using /GET_LUN • openr, lun, 'filename.txt', /GET_LUN • When you are finished with a logical unit number you can: • close it using the close command (in the first example) - close, 20 • or free the lun using FREE_LUN - FREE_LUN, lun
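The open / use / free cycle described above can be sketched as follows. This is only an illustration: the procedure name lun_demo_sketch and whatever file name you pass in are hypothetical, not part of the course files.

```idl
; Minimal sketch (not from the slides): write one line to a file, then read it back.
pro lun_demo_sketch, filename
  compile_opt idl2

  ; Let IDL pick a free logical unit number and open the file for writing.
  openw, lun, filename, /get_lun
  printf, lun, 'hello from IDL'
  free_lun, lun            ; closes the file and releases the unit

  ; Open the same file for reading, again with an IDL-managed unit.
  openr, lun, filename, /get_lun
  line = ''
  readf, lun, line         ; a string variable receives the whole line
  free_lun, lun

  print, line
end
```

Calling lun_demo_sketch, 'lun_demo.txt' should leave a one-line text file behind. Using /GET_LUN with FREE_LUN avoids having to remember which unit numbers are already in use.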

  5. Reading/Writing Formatted Data • IDL distinguishes between two types of formatted files with regard to reading and writing data: • Free File Format: uses either commas or whitespace (tabs and spaces) to distinguish each element in the file. It is more informal than an explicit file format. • An explicit format file is formatted according to rules specified in a format statement. The IDL format statement is similar to format statements you might use in FORTRAN.

  6. Writing a Free Format File • Writing a free format file in IDL is easy ... you just use the printf command to write the variables into a file. (This is the same as using print to write variables to the display.) • see example: “write_free_format.pro” • IDL puts white space between each element of the array variables and starts each variable on its own line. • IDL uses an 80-column width by default; you can change this using the width keyword in the openw command

  7. Reading a Free Format File • Many ASCII files are free format. In IDL you want to use the readf command to read free format data from a file. • IDL uses 7 (fairly simple) rules to read free format data

  8. Rule #1 • If reading into a string variable, all characters remaining on the current line are read into the variable. • Example: try to read in the first line of the file we just created called “test_free_format.txt” with the following:
      IDL> OPENR, lun, 'test_free_format.txt', /GET_LUN
      IDL> header = ''
      IDL> READF, lun, header
      IDL> print, header
      Test data file.
     • You can then read the first two lines (or any number of lines) in a similar way using the following:
      IDL> header = strarr(2)
      IDL> READF, lun, header
      IDL> print, header
      Test data file. Created: Sun Mar 29 13:35:52 2009

  9. Rule #2 • Input data must be separated by commas or whitespace (spaces or tabs). • This is exactly the kind of data that is in the test_free_format.txt file (after the header)!

  10. Rules #3, #4 and #5 • Input is performed on scalar variables. Arrays and structures are treated as collections of scalar variables. • Translation - if the variable you are reading into contains 10 elements, then IDL will try to read 10 separate values from the data file. It will then use rules 4 & 5 to determine where those values are located in the file. • If the current input line is empty, and there are still variables left requiring input, read another line. • If the current input line is not empty, but there are no variables left requiring input, ignore the remainder of the line.

  11. Rules #4 & #5 • Try the following:
      IDL> data = fltarr(8)
      IDL> READF, lun, data
      IDL> print, data
      0.00000 1.00000 2.00000 3.00000 4.00000 5.00000
      6.00000 7.00000
     • IDL reads 8 separate data values from the file. When it got to the end of the first row, it went to the second row (rule 4) because more data remained to be read. IDL read to the middle of the second row. • Now rule 5 comes into play. If you read more data, you will start on the third data line, because the rest of the second line is ignored.
      IDL> vector3 = fltarr(3)
      IDL> READF, lun, vector3
      IDL> print, vector3
      12.0000 13.0000 14.0000
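Rules 4 and 5 can be reproduced without the course data file. The sketch below writes its own scratch file with three lines of six values each and then repeats the two reads; the procedure name and the file name 'rules_demo.txt' are hypothetical.

```idl
; Sketch only: demonstrates free-format rules 4 and 5 on a self-made file.
pro free_format_rules_sketch, data, vector3
  compile_opt idl2
  filename = 'rules_demo.txt'      ; hypothetical scratch file

  ; Write three lines of six values each: 0..5, 6..11, 12..17.
  openw, lun, filename, /get_lun
  for row = 0, 2 do printf, lun, findgen(6) + 6*row
  free_lun, lun

  openr, lun, filename, /get_lun
  data = fltarr(8)
  readf, lun, data         ; rule 4: continues onto line 2 to collect 8 values
  vector3 = fltarr(3)
  readf, lun, vector3      ; rule 5: the rest of line 2 was dropped, so this reads line 3
  free_lun, lun

  print, data
  print, vector3
end
```

After the call, data holds the values 0 through 7 and vector3 starts at 12, because the unread values 8 through 11 on the second line were discarded.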

  12. Rule #6 • Make every effort to convert data into the data type expected by the variable. • To see what this means, read the 4th and 5th lines into a string array, so that the file pointer is positioned at the sixth data line (starts 33.6000). Try this:
      IDL> dummy = strarr(2)
      IDL> READF, lun, dummy
     • Now suppose you want to read two integer values. IDL makes every effort to convert the data (floating point values in this case) into integers.
      IDL> ints = intarr(2)
      IDL> READF, lun, ints
      IDL> print, ints
      33 77
     • See how the floating point values have simply been truncated.

  13. Rule #7 • Complex data must have a real part and an imaginary part separated by a comma and enclosed in parentheses. If only a single value is provided, it is considered the real part and the imaginary part is set to 0.
      IDL> value = complexarr(2)
      IDL> read, value
      : (3,4)
      : (4.4,25.5)
      IDL> print, value
      ( 3.00000, 4.00000)( 4.40000, 25.5000)

  14. Always make sure... • When you are done with a file, make sure that you close it! To do this type: IDL> FREE_LUN, lun

  15. Reading a Free Format File • Let's start by reading in the file we just created, “test_free_format.txt” • example: read_free_format.pro • Notice that the data variable is a 5x5 array in this case. It was put into the file as a 25-element vector. Reading the data this way is equivalent to reading a 25-element vector and reformatting the vector into a 5x5 array. • Remember that data in IDL is stored in row order (the first, or column, index varies fastest).

  16. Writing a column-format data file • It is not uncommon to see data stored in files in columns. It is a good idea to know how to read/write this sort of data in IDL ... but IDL has some surprises for you :) • Example: write_column_ascii.pro • As you can see, the first way does not give us what we want. • Must use a loop to get the desired column format in our output file.
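The surprise is that printf, lun, x, y writes all of x first and then all of y, rather than pairing the values up. A loop that prints one element of each vector per call gives the column layout. The sketch below is an assumption about what such a loop could look like; it is not the contents of write_column_ascii.pro, and the procedure name is hypothetical.

```idl
; Sketch only: write two equal-length vectors as two columns, one pair per line.
pro write_columns_sketch, filename, x, y
  compile_opt idl2

  openw, lun, filename, /get_lun
  ; Each printf call emits one line, so looping over the index
  ; produces one row per element with x and y side by side.
  for i = 0, n_elements(x) - 1 do printf, lun, x[i], y[i]
  free_lun, lun
end
```

For example, write_columns_sketch, 'columns_demo.txt', findgen(5), findgen(5)^2 should produce a five-line file with two numbers on each line.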

  17. Reading ASCII • filename - “sounding.txt”
      72469 DNR Denver Observations at 12Z 15 Mar 2009
      -----------------------------------------------------------------------------
         PRES   HGHT   TEMP   DWPT   RELH   MIXR   DRCT   SKNT   THTA   THTE   THTV
          hPa      m      C      C      %   g/kg    deg   knot      K      K      K
      -----------------------------------------------------------------------------
       1000.0     44  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999
        925.0    705  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999
        850.0   1413  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999
        830.0   1625    0.6  -15.4     29   1.40    165      4  288.7  293.1  289.0
        824.0   1683    2.8  -16.2     23   1.32    209      4  291.6  295.8  291.9
        819.0   1732    6.0  -15.0     21   1.46    247      4  295.5  300.2  295.8
        816.0   1761    6.6  -15.4     19   1.42    269      4  296.5  301.1  296.7
        809.2   1829    6.8  -18.0     15   1.15    320      4  297.4  301.2  297.6
        804.0   1882    7.0  -20.0     13   0.98    319      5  298.2  301.4  298.3
        785.0   2076    6.0  -19.0     15   1.09    316     10  299.1  302.7  299.3
        779.3   2134    5.5  -19.0     15   1.10    315     12  299.3  302.9  299.5
        750.3   2438    3.1  -19.0     18   1.14    300     15  299.9  303.7  300.1
        726.0   2702    1.0  -19.0     21   1.18    291     12  300.4  304.3  300.6
        722.2   2743    0.8  -19.9     20   1.10    290     12  300.7  304.3  300.9
        700.0   2991   -0.3  -25.3     13   0.70    250      9  302.1  304.5  302.2
        692.0   3083   -0.7  -28.7     10   0.52    244      9  302.7  304.5  302.8
        643.2   3658   -5.9  -30.1     13   0.49    205     10  303.1  304.9  303.2

  18. sounding.txt - info • File info: • 5 lines of header • missing data filled with -9999 • 11 columns (floating pt. and integers) • 117 rows of data

  19. Read as a “free-format” file • The “quick and dirty” way of reading the file “sounding.txt” into IDL is to treat it as a “free-format” data file, where every element is separated by a space. • See example: “read_free_format_sounding.pro” • treat all the data in the file as floating point data • read all of the data into an 11 x 117 floating point data array • you will have to keep track of which column corresponds with which variable. • This will not work if there are character flags in your data file
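A rough sketch of the “quick and dirty” approach follows. It is not the contents of read_free_format_sounding.pro: the function name and its arguments are hypothetical, and replacing the -9999 flag with NaN is an optional extra step that the slides do not mention.

```idl
; Sketch only: read a whitespace-separated table into one float array.
function read_sounding_sketch, filename, nheader, ncols, nrows
  compile_opt idl2

  openr, lun, filename, /get_lun

  ; Rule 1: reading into a string array consumes whole lines,
  ; which is a simple way to skip the header.
  header = strarr(nheader)
  readf, lun, header

  ; Rules 3-5: IDL reads ncols*nrows separate values, line after line.
  data = fltarr(ncols, nrows)
  readf, lun, data
  free_lun, lun

  ; Optional (assumption): turn the -9999 missing-data flag into NaN.
  bad = where(data eq -9999.0, nbad)
  if nbad gt 0 then data[bad] = !values.f_nan

  return, data
end
```

With the file described on the previous slide, the call would be data = read_sounding_sketch('sounding.txt', 5, 11, 117), after which reform(data[0, *]) is the pressure column and reform(data[2, *]) is the temperature column.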

  20. Reading a Column-Formatted Data file • What if you want to read the file “sounding.txt” and maintain the variable type of each column, and put the information into different variables (not just one block)? • You could try what is shown in “read_column_wrong.pro” • This will not give an error, but nothing is read into the variables. • Why doesn’t this work? - there is a “rule” in IDL that you cannot read into a subscripted variable. • Why not? - IDL passes subscripted variables into IDL procedures like readf by value and not by reference. Data that is passed by value cannot be changed inside of the called routine, since the routine has a copy of the data and not a pointer to the data itself.

  21. Reading a Column-Formatted Data file • One way to solve this problem is to read the data into temporary variables in the loop. • See example: “read_column_right.pro” • The only time this might be a problem is if you are trying to read a whole lot of data through the loop. • Remember that loops slow things down in IDL, so it might be better to read all the data in at once into an 11 x 117 array and then pull the vectors out of the larger array using array processing commands. (Again, this will not work if there are characters in your data file.) • See example: “read_column_other.pro”
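The temporary-variable approach might look like the sketch below. It is an illustration under assumptions, not the contents of read_column_right.pro: the procedure name is hypothetical, and only the first three columns (pressure, height, temperature) are kept.

```idl
; Sketch only: read the first three columns of a table into typed vectors.
pro read_columns_sketch, filename, nheader, nrows, pres, hght, temp
  compile_opt idl2

  pres = fltarr(nrows)
  hght = lonarr(nrows)
  temp = fltarr(nrows)

  openr, lun, filename, /get_lun
  header = strarr(nheader)
  readf, lun, header               ; skip the header lines

  ; Unsubscripted temporaries are passed by reference, so readf can fill them.
  p = 0.0
  h = 0L
  t = 0.0
  for i = 0, nrows - 1 do begin
    readf, lun, p, h, t            ; rule 5: the remaining columns are ignored
    pres[i] = p
    hght[i] = h
    temp[i] = t
  endfor
  free_lun, lun
end
```

Writing readf, lun, pres[i], hght[i], temp[i] inside the loop would fail silently for the reason given on the previous slide, which is why the values go through p, h, and t first.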

  22. Explicit File Format • To read/write an explicit file format, use the same readf and printf commands that were used for free format files, except now the file format is explicitly stated with the format keyword. • Example: “write_format_ascii.pro”
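As a hedged illustration of the format keyword (this is not the contents of write_format_ascii.pro), the sketch below writes 20 values five per line and reads them back with the same format string. The procedure name and the particular format are assumptions chosen for the example.

```idl
; Sketch only: write and re-read a file using one explicit format string.
pro explicit_format_sketch, filename, data_in
  compile_opt idl2

  ; Five values per line: two spaces, then a 6-character float with 2 decimals.
  fmt = '(5(2x,F6.2))'
  data = findgen(20)

  openw, lun, filename, /get_lun
  printf, lun, data, format=fmt
  free_lun, lun

  ; The same format string tells readf exactly where each value sits.
  data_in = fltarr(20)
  openr, lun, filename, /get_lun
  readf, lun, data_in, format=fmt
  free_lun, lun
end
```

When the format holds fewer items than there is data, IDL starts a new line and reuses the format, which is why 20 values end up on four lines here.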

  23. Example - I • Specifies integer data. Print the vector out as two-digit integers, five numbers on each line, each number separated by two spaces.
      IDL> data = findgen(20)
      IDL> fmt = '(5(I2,2x),/)'
      IDL> print, data, format=fmt
       0   1   2   3   4
       5   6   7   8   9
      10  11  12  13  14
      15  16  17  18  19

  24. Example - F • Specifies floating point data. Write the data out as floating values with two digits to the right of the decimal place, one value per line. (Be sure to include enough space in the number for the decimal point itself.)
      IDL> fmt = '(F5.2)'
      IDL> print, data, format = fmt
       0.00
       1.00
       2.00
       3.00
       4.00
       5.00
       6.00
       7.00
       8.00
       9.00
      10.00
      etc.

  25. Example - D • Specifies double precision data. Write out five numbers per line, four spaces between each number, in double precision values with 10 digits to the right of the decimal point.
      IDL> fmt = '(5(D13.10,4x))'
      IDL> print, data, format=fmt
       0.0000000000     1.0000000000     2.0000000000     3.0000000000     4.0000000000
       5.0000000000     6.0000000000     7.0000000000     8.0000000000     9.0000000000
      10.0000000000    11.0000000000    12.0000000000    13.0000000000    14.0000000000
      15.0000000000    16.0000000000    17.0000000000    18.0000000000    19.0000000000

  26. Reading/Writing Unformatted Data • Unformatted or binary data is much more compact than formatted data and is often used with large data files. • To read/write binary data use the readu and the writeu commands • Syntax for writing:
      IDL> x = findgen(10000)
      IDL> openw, lun, 'unformat_data.txt', /get_lun
      IDL> writeu, lun, x
      IDL> free_lun, lun
     • Syntax for reading:
      IDL> x = findgen(10000)
      IDL> openr, lun, 'unformat_data.txt', /get_lun
      IDL> readu, lun, x
      IDL> free_lun, lun

  27. Reading netCDF files • netCDF (network Common Data Form) is designed to be simple and flexible. • The basic building blocks are: • Variables are scalars or multidimensional arrays (supports string, byte, int, long, float and double) • Attributes contain supplementary information about a single variable or an entire file. Variable attributes include: units, valid range, scaling factor. Global attributes contain information about the file, e.g. creation date. Attributes are scalars or 1-D arrays. • Dimensions are long scalars that record the size of one or more variables

  28. What’s in a netCDF File? • use ncdump unix command or • testreadnc.pro • Try on the file “air.mon.mean.nc” • testreadnc prints info to the file “file_info.txt”
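If ncdump is not available, the variable names can also be listed from inside IDL with ncdf_inquire and ncdf_varinq. The sketch below is not testreadnc.pro; the function name is hypothetical and it assumes the file contains at least one variable.

```idl
; Sketch only: list the variables stored in a netCDF file.
function ncdf_var_names_sketch, filename
  compile_opt idl2

  nc = ncdf_open(filename)
  info = ncdf_inquire(nc)          ; structure with NDIMS, NVARS, NGATTS, RECDIM

  names = strarr(info.nvars)       ; assumes info.nvars is at least 1
  for i = 0, info.nvars - 1 do begin
    v = ncdf_varinq(nc, i)         ; structure with NAME, DATATYPE, NDIMS, NATTS, DIM
    names[i] = v.name
    print, v.name, '  ', v.datatype, v.ndims
  endfor

  ncdf_close, nc
  return, names
end
```

The returned names are the strings you would pass to ncdf_varget to read each variable.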

  29. Reading a Variable
      ; open the netcdf file
      nc = ncdf_open('/Users/rachel/idl_course_week2/air.mon.mean.nc')
      ; Extract the variables of interest
      ncdf_varget, nc, 'lat', lat
      ncdf_varget, nc, 'lon', lon
      ncdf_varget, nc, 'level', level
      ; close the netcdf file
      ncdf_close, nc
     • The file is opened with the ncdf_open command, returning the file identifier “nc” • the ncdf_varget procedure is used to read in the entire contents of the variable • the file is then closed using ncdf_close • See example: read_netcdf.pro

  30. Reading an Attribute
      ; open the netcdf file
      nc = ncdf_open('/Users/rachel/idl_course_week2/air.mon.mean.nc')
      ; Extract the variables of interest
      ncdf_varget, nc, 'air', temperature
      ; Extract the attributes of interest
      ncdf_attget, nc, 'air', 'add_offset', offset
      ncdf_attget, nc, 'air', 'scale_factor', scale
      ; close the netcdf file
      ncdf_close, nc
     • In this example, we extract the attributes from the variable “air” called “add_offset” and “scale_factor” using the ncdf_attget command. • See example: read_netcdf.pro
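Attributes named scale_factor and add_offset normally mean the variable is stored in packed form, and the usual netCDF convention is unpacked = packed * scale_factor + add_offset. The sketch below applies that convention; the function name is hypothetical, and it assumes both attributes exist on the variable.

```idl
; Sketch only: read a packed netCDF variable and convert it to physical units.
function unpack_ncdf_var_sketch, filename, varname
  compile_opt idl2

  nc = ncdf_open(filename)
  ncdf_varget, nc, varname, packed
  ncdf_attget, nc, varname, 'scale_factor', scale
  ncdf_attget, nc, varname, 'add_offset', offset
  ncdf_close, nc

  ; Attributes may come back as 1-element arrays, so index them explicitly.
  return, packed * scale[0] + offset[0]
end
```

For the file used above this would be called as temperature = unpack_ncdf_var_sketch('air.mon.mean.nc', 'air'). Whether that particular file actually needs unpacking depends on the attribute values it contains.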

  31. Some netCDF routines

  32. Simple Graphical Displays • The “bread and butter” of scientific analysis is the ability to see your data as simple line plots, contour plots, and surface plots. • The quickest way to visualize data graphically in IDL is to plot to the display window. • This is not the way you will create “publishable” graphics, but it is a good way to start looking at your data and it is a good way to learn how to display your data using the plot command.

  33. A few things you should know .... • IDL supports a system known as “Direct Graphics” which is built to a “device” - oriented model. • What does this mean?? - you select a particular graphics device (screen, printer, PostScript) and then draw graphics on that device. • This also means that you will have to switch the device to change how you visualize data.

  34. Graphics • The set_plot procedure selects a named graphics device. • All subsequent graphics output is sent to the selected graphics device until another set_plot call is made or the IDL session ends • set_plot, device • 'X', 'WIN', 'PS' • Default is the screen display (set_plot, 'x') • The following would then display a plot to the screen: IDL> plot, indgen(10)
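Switching devices follows the same pattern every time: select the device, draw, close the output if it is a file, and switch back. A minimal sketch for PostScript output is shown below; the procedure name and the output file name are hypothetical.

```idl
; Sketch only: send one plot to a PostScript file, then restore the old device.
pro ps_plot_sketch, filename
  compile_opt idl2

  old_device = !d.name             ; remember the current device (e.g. 'X' or 'WIN')

  set_plot, 'ps'
  device, filename=filename        ; name of the PostScript file to create
  plot, indgen(10)
  device, /close                   ; flush and close the PostScript file

  set_plot, old_device             ; later plots go back to the original device
end
```

Saving !d.name first means the routine works the same way on X Windows and on Microsoft Windows displays.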

  35. A few things you might notice ... • IDL’s default setting is to plot things with a black background using white font. This can be quite annoying at times and I will teach you how to change that in a bit. • The window is labeled “IDL 0” • We plotted 1 vector, but got an x vs. y plot.

  36. Window Options • A graphics window can be created directly using the window command or indirectly by issuing a graphics display command when no window is open. IDL> window • The title bar of the window has a number in it (0 in this case) - this is the window’s graphics window index number. • You can open a new window with a graphics window index number of 1 by doing the following: IDL> window, 1 • You are allowed up to 128 different windows (but boy that would be confusing) • IDL will assign window index numbers for windows 32-127 by using the /free keyword: IDL> window, /free • If you don’t explicitly open a new window, IDL will plot over whatever is on the current window.

  37. Window options • There are other options for setting, deleting, and clearing windows. Try the following: • Open two windows (indexed 1 and 2) • Plot something in the current window (window 2) • Plot something in window 1 (using wset) • delete window 2 • erase the contents of window 1 (erase acts on the current window; its optional argument is a background color, not a window index)
      IDL> window, 1
      IDL> window, 2
      IDL> plot, indgen(10)
      IDL> wset, 1
      IDL> plot, sin(findgen(200))
      IDL> wdelete, 2
      IDL> erase

  38. Window Options • You can also change the position and size of a graphics window. • Use xsize and ysize to change the size of the window (these are in pixels) • Use xpos and ypos to relocate the position of the graphics window. Windows are positioned on the display with respect to the upper left-hand corner of the display in pixel or device coordinates.
      IDL> window, 1, xsize=200, ysize = 300
      IDL> window, 2, xpos=75, ypos=150

  39. Creating Plots • The plot procedure draws graphs of vector arguments. It can be used to create line graphs, scatter plots, barplots, and even polar plots. • The plot command can be modified with over 50 different keywords!

  40. Plotting a vector • IDL will try to plot as nice a plot as it possibly can with as little information as it has. • Try plotting: IDL> plot, sin(findgen(100)*.2) • In this case the x or horizontal axis is labeled from 0-100, corresponding to the number of elements in the vector. The y or vertical axis is labeled with the data coordinates (this is the dependent data axis) • For the following see: example_line_plot.pro, and example_line_plot2.pro

  41. Plotting x vs. y • Most of the time a line plot displays one data set (the independent data) plotted against a second data set (the dependent data). • For example: • x represents radian values extending from 0 to 2*pi and y is the sine of x
      IDL> x = FINDGEN(21)*.1*!pi
      IDL> y = sin(x)
      IDL> plot, x, y

  42. Customize Graphics Plots • To label the x-axis, y-axis, and title use: xtitle, ytitle, and title keywords, respectively.
      IDL> plot, x, y, xtitle = 'Radians', ytitle = 'Radians', $
             title = 'Sine Wave'
     • By default, the plot title is 1.25 times larger than the x and y axes labels - there are a number of ways to change this: • To change the size of all the plot annotations use the charsize keyword.
      IDL> plot, x, y, xtitle = 'Radians', ytitle = 'Radians', $
             title = 'Sine Wave', charsize = 1.5
     • To change the character size of each individual axis use [XYZ]charsize.
      IDL> plot, x, y, xtitle = 'Radians', ytitle = 'Radians', $
             title = 'Sine Wave', xcharsize = 2, ycharsize = 3

  43. Modify line styles and thickness • Use the linestyle keyword to plot the data with a different line style. For example to get a dashed line (instead of a solid line) use: IDL> plot, x, y, linestyle = 2 • Use the thick keyword to change the thickness of the line plots. For example if you want the plot displayed with a dashed line that is three times thicker than normal try: IDL> plot, x, y, linestyle = 2, thick = 3

  44. Symbols • You can also plot your data using symbols instead of lines. Like the linestyle keyword, similar index numbers exist to allow you to choose different symbols. • For example you can draw the plot with asterisks by setting the psym keyword to 2:
      IDL> plot, x, y, psym = 2
     • You can also connect your plot symbols with lines by using negative values for the psym keyword. To plot triangles connected by a solid line try:
      IDL> plot, x, y, psym = -5

  45. Plot Style and Range • You can also limit the amount of data you plot with keywords. To just plot data that lies between 1 and 3 on the x axis and -0.5 and 0.5 on the y axis try:
      IDL> plot, x, y, xrange = [1,3], yrange = [-0.5,0.5]
     • You can change the way your plot looks by using the [XYZ]style keywords. • Sometimes IDL is lame and won’t like your chosen axis range (because the chosen range is not “aesthetically” pleasing). Use the [XYZ]style keyword to make IDL listen to you! For example, to force an exact axis range use:
      IDL> plot, x, y, xstyle = 1

  46. Adding Lines to graphics • Use plots to add lines to your plots. • syntax: plots, [x0,x1],[y0,y1],[z0,z1] • Example: add a line that spans the length of the x-axis and crosses through 0 on the y-axis. • plot, x, y, xrange = [0, 2*!pi], xstyle =1 • plots, [0, 2*!pi],[0,0]

  47. Tick marks, intervals, and names • [XYZ]ticklen - controls the length of the axis tick marks (expressed as a fraction of the window size). Default is 0.02. ticklen = 1.0 produces a grid; a negative ticklen makes marks that extend outside the window. • [XYZ]tickinterval - set to a scalar to indicate the interval between major tick marks • [XYZ]tickname - a string array of up to 30 elements that controls the annotation of each tick mark.

  48. Plotting Multiple Data Sets • Use the oplot command to plot multiple data sets on the same set of axes.
      IDL> x = findgen(21)*.1*!pi
      IDL> y = sin(x)
      IDL> y2 = cos(x)
      IDL> plot, x, y
      IDL> oplot, x, y2, linestyle = 2
     • What if you have two data sets that require different axes?

  49. Positioning • You can also position the plot inside the window using the position keyword. • position is a 4-element vector giving, in order, the coordinates [(x0,y0),(x1,y1)] of the lower left and upper right corners of the data window. Coordinates are expressed in normalized units ranging from 0.0 to 1.0. The position keyword is never specified in data units.
      IDL> x = findgen(21)*.1*!pi
      IDL> y = sin(x)
      IDL> plot, x, y, position = [0.2, 0.2, 0.8,0.8]

  50. Plotting with Multiple Axes • Sometimes you have two or more data sets on the same line plot, but you want the data sets to use different y axes. It is easy to establish as many axes as you need with the axis command. • The key to using the axis command is to use the /save keyword to save the proper plotting parameters. • Try this (note: the /ylog keyword is set to make the new axis have a log scale):
      IDL> x = findgen(21)*.1*!pi
      IDL> y = sin(x)
      IDL> y2 = dindgen(21)^10
      IDL> plot, x, y, position = [0.2,0.2,0.8,0.8],$
             xtitle = 'x', ytitle = 'y'
      IDL> axis, yaxis=1, yrange = [.001,5E9],/save, $
             ytitle = 'other axis',/ylog
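Once the new axis has been drawn with /save, a following oplot call uses the saved scaling, so the second data set is drawn against the new axis. The sketch below shows the full sequence with a stand-in second data set, exp(x), chosen so that every value is positive on the log axis. The procedure name, the stand-in data, the axis range, and the use of ystyle=8 to leave the right-hand side free are all assumptions for illustration.

```idl
; Sketch only: two data sets on one plot, each with its own y axis.
pro two_axes_sketch
  compile_opt idl2

  x  = findgen(21) * 0.1 * !pi
  y  = sin(x)
  y2 = exp(x)              ; stand-in second data set, roughly 1 to 535

  ; ystyle=8 suppresses the box-style axis, so only the left y axis is drawn.
  plot, x, y, position=[0.2, 0.2, 0.8, 0.8], ystyle=8, $
        xtitle='x', ytitle='y'

  ; Draw a log-scaled axis on the right and save its scaling.
  axis, yaxis=1, yrange=[1, 1000], /ylog, /save, ytitle='other axis'

  ; oplot now uses the saved scaling, so y2 follows the right-hand axis.
  oplot, x, y2, linestyle=2
end
```

The dashed line makes it easy to tell which curve belongs to the right-hand axis.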
