GDT Tips and Tricks

quana

Presentation Transcript


  1. GDT Tips and Tricks

  2. File Handling • The Indexed File Structure • The BINARY Tree • What happens during a Read and Write operation? • What happens to the Index? • How do we obtain data by the Index? • The Impact of Compressing your keys • How is Data File Integrity maintained • Causes and Response to File Corruption • Enhancing Performance!

  3. Questions to Ponder • What level of data integrity do you require in your files? • Maybe you want to flush every write operation to disk immediately? • Maybe you want a reasonable level of integrity, where you let Micro Focus write the data as soon as it can, to protect against the application being killed, etc.? • Maybe you are comfortable with, and have the luxury of, simply re-running your applications to recover from an untimely event?

  4. Indexed File Structure • The basics • Your indexed file will have a primary key and possibly alternate keys that interface to your data. • The data will include live records as well as deleted records.

  5. Index Structure

  6. Index Structure

  7. The Binary Tree (Read a Key)

  8. The Binary Tree (Read a Key)

  9. The Binary Tree (Read a Key)

  10. The Binary Tree (Read a Key)

  11. The “Binary Chop”

  12. The “Binary Chop”

  13. The “Binary Chop”

  14. The “Binary Chop”
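A minimal sketch of the binary chop on a single index node, assuming fixed-length keys held in sorted order. The `Node` layout and the `binary_chop` name are illustrative assumptions, not the Micro Focus on-disk format.

```python
from typing import List, Optional, Tuple

# Hypothetical in-memory node: sorted (key, record_address) pairs.
Node = List[Tuple[bytes, int]]


def binary_chop(node: Node, wanted: bytes) -> Tuple[Optional[int], int]:
    """Return (record address or None, number of key comparisons)."""
    low, high = 0, len(node) - 1
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2      # jump straight to the middle entry
        key, address = node[mid]
        comparisons += 1
        if key == wanted:
            return address, comparisons
        if key < wanted:
            low = mid + 1            # discard the lower half
        else:
            high = mid - 1           # discard the upper half
    return None, comparisons


# 1000 fixed-length keys: K0000 .. K0999
node = [(f"K{i:04d}".encode(), i * 100) for i in range(1000)]
address, comparisons = binary_chop(node, b"K0750")
```

Because every key has the same length, the middle entry can be located by arithmetic alone, so a node of 1000 keys needs at most 10 comparisons.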

  15. The Binary Tree (Read a Key) • What if you do a Random Read of another key? • The search starts at the beginning node and works its way back down the chain, UNLESS the nodes from the previous READ are still in cache, in which case it can read them from cache. • ALTERNATIVELY, if you are doing a sequential READ NEXT, it knows via cache the previously read node and starts from there (much quicker).

  16. Key Compression • Key Compression – to save space • Types of Key Compression • Duplicate Key Compression • May be used when you have many identical keys • Stores the first instance of the key, while every other occurrence holds a pointer to the node it should point to • Leading Character Compression • If the first record key contains AAAAA and the second contains AAAAB, then the second key stores only “B”; the leading A’s are compressed. The key does, however, carry the information needed to decompress it. • Trailing Space Compression • Spaces at the end of the key are compressed. Again, the information needed for decompression is maintained. • Trailing Null Compression • Nulls at the end of the key are compressed. Again, the information needed for decompression is maintained.
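A rough sketch of leading character and trailing space compression, using the AAAAA / AAAAB example above. The entry layout `(shared, suffix, pad)` is a made-up illustration of "information required so the key can be decompressed", not the real index format.

```python
from typing import List, Tuple

# Hypothetical compressed entry:
# (leading chars shared with previous key, remaining suffix, trailing spaces removed)
Entry = Tuple[int, bytes, int]


def common_prefix_len(a: bytes, b: bytes) -> int:
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def compress_keys(keys: List[bytes]) -> List[Entry]:
    entries = []
    previous = b""
    for key in keys:
        stripped = key.rstrip(b" ")                 # trailing space compression
        pad = len(key) - len(stripped)
        shared = common_prefix_len(previous, stripped)  # leading character compression
        entries.append((shared, stripped[shared:], pad))
        previous = stripped
    return entries


def decompress_keys(entries: List[Entry]) -> List[bytes]:
    keys = []
    previous = b""
    for shared, suffix, pad in entries:
        stripped = previous[:shared] + suffix       # needs the previous key
        keys.append(stripped + b" " * pad)
        previous = stripped
    return keys


keys = [b"AAAAA", b"AAAAB", b"AB   "]
entries = compress_keys(keys)
restored = decompress_keys(entries)
```

Here the second key is stored as `(4, b"B", 0)`: only the "B" is kept, plus a count of the four leading characters it shares with its predecessor.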

  17. Key Compression • What happens when you try to read with Key Compression? • Keys are not fixed length (some compressed more than others) • So, the keys need to be decompressed before they can be read and compared to the key being looked for • The “Binary Chop” cannot happen • MUST SEQUENTIALLY WALK THROUGH EVERY NODE!
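A sketch of why the chop is lost, assuming leading character compression only. Each stored entry can be rebuilt only from the key before it, so the search has to start at the first entry and decompress as it goes. The entry layout and function name are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical compressed entry:
# (leading chars shared with previous key, remaining suffix, record address)
Entry = Tuple[int, bytes, int]


def find_in_compressed_node(entries: List[Entry], wanted: bytes) -> Tuple[Optional[int], int]:
    """Return (record address or None, number of keys decompressed)."""
    previous = b""
    examined = 0
    for shared, suffix, address in entries:
        key = previous[:shared] + suffix   # rebuild the full key from its predecessor
        examined += 1
        if key == wanted:
            return address, examined
        if key > wanted:
            break                          # keys are sorted: wanted is not in this node
        previous = key
    return None, examined


# Full keys: AAAAA, AAAAB, AAAAC, ABBBB
entries = [(0, b"AAAAA", 10), (4, b"B", 20), (4, b"C", 30), (1, b"BBBB", 40)]
last = find_in_compressed_node(entries, b"ABBBB")
second = find_in_compressed_node(entries, b"AAAAB")
```

Finding the last key in the node costs one decompression per entry, which is the linear cost the slide warns about.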

  18. Indexed Files (Writing Records) • What is happening? • Every index in the file needs to be updated (Primary and Alternate Keys) • The basic process: • The Header is updated – just to say we are in mid-update • The Record is added to the Data file • Indexes are Updated – 1st the Primary then the Alternate keys • The Header is updated – to say that the action is completed

  19. Indexed Files (Writing Records)

  20. Indexed Files (Writing Records)

  21. Indexed Files (Writing Records): “NODE SPLITTING” • Done to have the available room to add the entry to the node. • Must look at the preceding node to verify that it also has available room to add the entry.

  22. Indexed Files (Writing Records): “NODE SPLITTING”

  23. Indexed Files (Writing Records): “NODE SPLITTING”
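A simplified sketch of a node split, assuming a node is just a sorted list of keys with a fixed capacity. The half-and-half split point and the `insert_with_split` name are assumptions for illustration; the slides do not say how Micro Focus chooses the split point.

```python
import bisect
from typing import List, Optional, Tuple


def insert_with_split(
    node: List[bytes], key: bytes, capacity: int
) -> Tuple[List[bytes], Optional[List[bytes]]]:
    """Insert key into a sorted node; split the node in two if it overflows."""
    keys = list(node)
    bisect.insort(keys, key)               # keep the keys in sorted order
    if len(keys) <= capacity:
        return keys, None                  # there was room: no split needed
    middle = len(keys) // 2
    return keys[:middle], keys[middle:]    # two nodes, each with spare room


left, right = insert_with_split([b"B", b"D", b"F", b"H"], b"E", capacity=4)
```

After a split, the level above needs an entry for the new node, so it must have room too, or it splits in turn. Reading "preceding node" as that level above is an interpretation, not something the slide states.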

  24. How File Integrity is Maintained • The File Header • Static Information • File Attributes • Number of Keys • Format and Organization of a file • Dynamic Information • Integrity indicator • Modification Counter • Logical EOF marker

  25. How File Integrity is Maintained: The File Header

  26. How File Integrity is Maintained: The File Header • Integrity Flag • The File Handler uses this flag to maintain integrity • 2-byte field • Value depends on the update being performed (type of operation) • A nonzero value when the header is read tells the File Handler that an operation was not fully completed.

  27. How File Integrity is Maintained: The File Header • Modified Value Field • Position 105-108 • 4-byte field • Used as an aid to performance • If a process detects that this value has changed since it last read the header, its cached nodes are invalid and it must re-read the nodes physically stored in the index structure.
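A small sketch of how a per-process node cache could use the Modified Value field. The `NodeCache` class and its method names are hypothetical; only the rule "counter changed, so cached nodes are invalid" comes from the slide.

```python
from typing import Dict, Optional


class NodeCache:
    """Per-process cache of index nodes, guarded by the header's modification counter."""

    def __init__(self) -> None:
        self.nodes: Dict[int, bytes] = {}
        self.seen_counter: Optional[int] = None

    def validate(self, header_counter: int) -> bool:
        """Call after each header read. Returns True if the cached nodes are still usable."""
        changed = self.seen_counter is not None and header_counter != self.seen_counter
        if changed:
            self.nodes.clear()       # another process updated the file: cache is stale
        self.seen_counter = header_counter
        return not changed


cache = NodeCache()
first = cache.validate(7)            # first header read: nothing to invalidate
cache.nodes[1] = b"root node"
unchanged = cache.validate(7)        # counter unchanged: keep using the cache
kept = dict(cache.nodes)
stale = cache.validate(8)            # counter moved on: nodes must be re-read from disk
```

Comparing one 4-byte value is far cheaper than re-reading every node, which is why the field helps performance.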

  28. Understanding the Write Operation • The File Handler obtains a “Write Semaphore” • To allow only 1 process to update at a time • Control-Break is disabled • The File Header is read • The Integrity Flag is updated • To ensure that another process has not left the file in a corrupt state; also checks the Modified Value flag • Update and write the File Header, basically stating that the process is performing a write operation. • Write DATA • The record created is written to disk • Write INDEX • Index is created and written to disk • INTEGRITY FLAG is reset and written back to disk so that another process can update the file • FILE HEADER is written • SEMAPHORE is RELEASED
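The sequence above can be sketched in memory as follows. This is a toy model: `threading.Lock` stands in for the write semaphore, a dict stands in for the file header, and incrementing the counter by one on every write is an assumption. None of the names are Micro Focus APIs.

```python
import threading


class IndexedFileSketch:
    def __init__(self) -> None:
        self.write_semaphore = threading.Lock()
        self.header = {"integrity_flag": 0, "modification_counter": 0}
        self.data = []       # stands in for the data file
        self.index = {}      # stands in for the index: key -> record position
        self.trace = []      # order of steps, for illustration only

    def write(self, key: bytes, record: bytes) -> None:
        with self.write_semaphore:                  # only 1 process updates at a time
            if self.header["integrity_flag"] != 0:  # a previous update never completed
                raise RuntimeError("file was left mid-update")
            self.header["integrity_flag"] = 1       # header says: write in progress
            self.trace.append("flag set")
            self.data.append(record)                # write DATA
            self.trace.append("data written")
            self.index[key] = len(self.data) - 1    # write INDEX
            self.trace.append("index written")
            self.header["modification_counter"] += 1  # assumed: bumped on each update
            self.header["integrity_flag"] = 0       # header says: update completed
            self.trace.append("flag reset")


f = IndexedFileSketch()
f.write(b"K1", b"record one")
```

If the process dies between "flag set" and "flag reset", the next process to read the header sees a nonzero flag and knows the last operation did not complete.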

  29. Understanding the Write Operation • Special Note • When using a WRITE / OPEN EXCLUSIVE on a file, the indexes are CACHED until either the CACHE limit is reached or a CLOSE operation is done.

  30. Possible Causes of Corruption of Indexed Files • KILL -9 is used on Unix • Need to use KILL -15 • RTS invokes the Micro Focus Exit procedure, flushing the cached index nodes back to disk • Copying open files on Unix • Unix allows files opened Exclusive to be copied; at the time of copying, the cached index nodes may not yet have been flushed • Network problems • Machine rebooted or powered off while indexed files are open • Actual error in the system itself
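A generic illustration of why KILL -15 is safe and KILL -9 is not. Signal 15 (SIGTERM) can be caught, so a process gets the chance to flush its cache before exiting; signal 9 (SIGKILL) cannot be caught, so no exit procedure ever runs. The cache dictionaries and function names are hypothetical stand-ins, not the Micro Focus RTS.

```python
import signal
import sys

cached_nodes = {3: b"node-3", 7: b"node-7"}   # hypothetical unflushed index nodes
flushed_to_disk = {}                          # stands in for the file on disk


def flush_cache() -> None:
    """Write every cached node back and empty the cache."""
    flushed_to_disk.update(cached_nodes)
    cached_nodes.clear()


def on_sigterm(signum, frame) -> None:
    """Exit procedure: runs on kill -15, never on kill -9."""
    flush_cache()
    sys.exit(0)


def install_handler() -> None:
    # SIGKILL cannot be registered here: the OS does not allow it to be caught.
    signal.signal(signal.SIGTERM, on_sigterm)


if __name__ == "__main__":
    install_handler()
```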

  31. Fixing Corrupted Files • REBUILD UTILITY • Taking the attributes of the input file to produce the output file • Requires Exclusive use of the file • TO REORGANIZE A FILE • USES THE INDEX TO READ THE DATA RECORDS • REBUILD INFILE,OUTFILE • TO FIX CORRUPTED INDEXES • IGNORES INDEXES AND DIRECTLY READS RECORDS FROM THE DATA FILE • NEW DATA STRUCTURE AND DATA FILE CREATED • REBUILD INFILE,OUTFILE /d • TO RUN REBUILD ON LIVE DATA • REBUILD INFILE

  32. Summary on File Integrity • The MOST Integrity? • WRITETHRU directive • Can be used at compile time or as a tunable to the File Handler • When OPEN is done on the file, specifies to the Operating System that any WRITES to the file are flushed immediately to disk. • PERFORMANCE takes a NOSE DIVE immediately! • Default Level of Integrity? • Reasonable level of integrity • Micro Focus will write data as soon as possible • Protects against the application being killed • A couple of directives to look at • IDXDATBUF • The IDXDATBUF option determines the size of the buffer used when accessing the data portion of a file with organization INDEXED. • DEFINE NBBUF & BPB from JCL will override this setting if given. • LOADONTOHEAP • The LOADONTOHEAP option specifies whether the File Handler loads the file into memory before executing any I/O operations. • Able to RERUN your applications • May be recovery enough
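As a generic operating-system analogue of write-through (not the WRITETHRU implementation itself), the sketch below forces each record to disk with `os.fsync` before writing the next one. The `write_records` helper and file names are made up for illustration.

```python
import os
import tempfile


def write_records(path: str, records, write_through: bool) -> None:
    with open(path, "wb") as handle:
        for record in records:
            handle.write(record)
            if write_through:
                handle.flush()               # push the program's buffer to the OS
                os.fsync(handle.fileno())    # ask the OS to commit it to disk now


records = [b"REC%05d\n" % i for i in range(100)]
with tempfile.TemporaryDirectory() as folder:
    safe_path = os.path.join(folder, "safe.dat")
    write_records(safe_path, records, write_through=True)
    with open(safe_path, "rb") as handle:
        stored = handle.read()
```

With `write_through=True` every record costs a trip to the disk, which is the trade-off the slide describes: the most integrity, the least performance.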

  33. File Handling Performance • Getting better performance from your application by getting better performance from your file handler • Advances in technology, CPU speed, the amount of memory addressable by a process and code generator optimization have made it easier to push back thoughts of trying to improve performance • Accessing your disk is still the slowest thing you can do on your machines today

  34. File Handling Performance • You can make certain aspects of the file handler perform better, but you need to be careful about how this can affect another application accessing the file in a different manner. • Micro Focus provides an “All Round” solution to performance • Giving you the ability to tune the file handler to what your application’s performance needs!

  35. File Handling Performance • The BIG question…what should you use? • Understanding what to use is based on your understanding of the Binary Tree. • 1 for every index of your data file

  36. Tuning Your Files • Access Permission • Examine your data files on an individual, per-file basis • Not every file needs complete access for everyone • When Opening Files: • Use Exclusive access where possible • Otherwise allow Only Readers • Only when absolutely necessary, give all others complete access to the file • Micro Focus Timing (8 million records written on IDX 8 format file) • Exclusive access - 7 min 13 sec • Allow Readers – 25 min • Allow all Others – 25 min

  37. Tuning Your Files • Based on the timings below, you might be tempted to choose Allow all Others over Allow Readers, since both took the same time, but that is only because a single user was used in the test. • Micro Focus Timing (8 million records written on IDX 8 format file) • Exclusive access - 7 min 13 sec • Allow Readers – 25 min • Allow all Others – 25 min

  38. Tuning Your Files • Write Allowing Readers • When update is done, goes to disk to allow others to read • Write Allowing Others • Other updates by other processes may be done at same time. • When Writing • Nodes are Cached into memory • With Exclusive use, only has to check if nodes have been changed. Quick • With allowing others, keeps reading nodes off of disk as they are changed. A lot slower.

  39. Tuning Your Files • Micro Focus Timing • (8 million records read on IDX 8 format file) • Exclusive access - 2 min 32 sec • Allow Readers – 2 min 32 sec • Allow all Others – 6 min 28 sec • This allows applications to change the file. It has a lot more checks to do to see if the file has been changed each time.

  40. File Handler Configuration Settings • READSEMA • Specifies whether or not the system attempts to gain a semaphore for shared files when operations are performed that do not modify the file (READ, START, etc.) • You need to ensure that it is set to OFF (default) • You might think that this can cause dirty reads? • No. When you read a record, it checks whether the record has been changed; if so, it then takes out a semaphore on that record. • Setting it ON can degrade performance by 15%

  41. File Handler Configuration Settings • IGNORELOCK • For when you are not concerned about a “dirty” read. • Not bothered that someone comes in and changes a record you have just read. • Can improve performance by 15% • Note that this is handled internally by GDT through the READLOCK=STAT directive.

  42. KEYS • The shorter the keys are in size, the better. • Fit more in a node • Quicker to traverse the Binary Tree • Remove redundant keys • Each key has a tree • Needs to be updated for each insert and delete • Micro Focus Timing • 8 million records Read • 2 alternate keys – 6 min 28 sec • 3 alternate keys – 8 min 40 sec • 4 alternate keys – 53 min 08 sec ! (we will talk about this in a couple of minutes)

  43. Compression • Data Compression • Minimal Performance hit • When reading a record, it will traverse the tree, every time it gets a record, it has to decompress the record before writing. • Index Compression • If you can get away with it, do not use it! • Always a hit in performance – sometimes severe! • File handler cannot Binary Chop the node when searching for the key. • All keys are different size • Cannot tell where middle of the node is

  44. Compression • Micro Focus Timing • 8 million records • Sequential Write No data/key compression – 6 min 40 sec • Random Read No data/key compression – 6 min 28 sec • Sequential Read No data/key compression – 6 min 12 sec • Sequential Write with Data compression – 8 min 14 sec • Random Read with Data compression – 7 min 59 sec • Sequential Read with Data compression – 7 min 53 sec • Sequential Write w/ Data/Key compression – 15 min 18 sec • Random Read w/ Data/Key compression – 15 min 36 sec • Sequential Read w/ Data/Key compression – 7 min 50 sec • SEQUENTIAL READ – Consistent. Doing a read next it will always know where previous key is. Much different than random reads.

  45. FINE TUNING • Setting File Handler Configuration Options • Set on per file basis • There is no magic formula. • You need to adjust to suit each application • Can have both positive and negative impact on your application • SET EXTFH=C:\EXTFH.CFG • [XFH-DEFAULT] • NODESIZE=4096 • [FILE1.DAT] • NODESIZE=1024 • [FILE2.DAT] • INDEXCOUNT=32

  46. INDEXCOUNT • Specifies the number of index nodes to be cached for an indexed file per process • Default cache size is 16 nodes • [XFH-DEFAULT] • INDEXCOUNT=32 • Note that this is handled internally by GDT through the NBBUF & BPB directives. • [FILE1.DAT] • INDEXCOUNT=16

  47. INDEXCOUNT IN ACTION: INDEXCOUNT = 4

  48. INDEXCOUNT IN ACTION: INDEXCOUNT = 4

  49. INDEXCOUNT IN ACTION: INDEXCOUNT = 4

  50. INDEXCOUNT IN ACTION: INDEXCOUNT = 4
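A sketch of a node cache limited to INDEXCOUNT = 4 entries. The slides do not say which node is discarded when the cache is full, so the least-recently-used policy here is an assumption, and `IndexNodeCache` is a hypothetical name.

```python
from collections import OrderedDict


class IndexNodeCache:
    """Caches at most `indexcount` index nodes; evicts the least recently used (assumed policy)."""

    def __init__(self, indexcount: int = 16) -> None:
        self.indexcount = indexcount
        self.nodes = OrderedDict()
        self.disk_reads = 0

    def _read_from_disk(self, node_id: int) -> bytes:
        return b"node-%d" % node_id          # placeholder for a real disk read

    def get(self, node_id: int) -> bytes:
        if node_id in self.nodes:
            self.nodes.move_to_end(node_id)  # cache hit: mark as most recently used
            return self.nodes[node_id]
        self.disk_reads += 1                 # cache miss: go to disk
        node = self._read_from_disk(node_id)
        self.nodes[node_id] = node
        if len(self.nodes) > self.indexcount:
            self.nodes.popitem(last=False)   # drop the least recently used node
        return node


cache = IndexNodeCache(indexcount=4)
for node_id in [1, 2, 3, 4, 1, 2, 5, 1]:
    cache.get(node_id)
```

Eight node requests cost only five disk reads here. A larger INDEXCOUNT keeps more of the tree in memory, at the price of more memory per process.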
