Accelerate Data Movement with High-Performance Loader (HPL) for Informix
Discover how the High-Performance Loader (HPL) can transform your data loading and unloading process in Informix. HPL offers exceptional speeds—with rates up to 11.5 GB/CPU/hr—by utilizing multi-threading and avoiding traditional buffering methods, making it faster and more flexible than conventional loaders. Learn about key functionalities, such as loading from various sources and customizing processes to meet specific needs. With HPL, you can achieve significant performance improvements while utilizing existing resources. Start optimizing your data processes today!
Accelerate Data Movement with High-Performance Loader (HPL) for Informix
E N D
Presentation Transcript
Jack Parker Arten Technology Group
Calisthenics • How many here are Informix users? • How many here are HPL users?
Why use HPL • It’s Fast • It’s Flexible
Previous (and still extant) Loaders • LOAD • 1GB/Hr. 2GB Limit. Long Transaction heaven. • dbload • 1GB/Hr. 2GB Limit. • dbimport • 1GB/Hr. 2GB Limit. Entire database • Insert Cursor • 2GB/Hr. 2GB Limit. Complex. • High Performance Loader – Faster. More flexible. • See Load FAQ at www.artentech.com/downloads.htm
Why is it faster? • Other loaders • Single threaded • Use Buffers (the Wall) • HPL • Multi-threaded • Avoids Buffers • Light Scans / Light Appends
Light Scans and Light Appends • Light Scans use their own buffer pool and avoid the normal buffers • Light Appends write to their own disk and append the results if successful • Light scans occur when: • Table is bigger than buffers • Dirty Read • No Varchars
In reality… • HPL performance is dependent not only on CPU, but on disk I/O • The more disk you can put into the equation, the faster it will run. • My tests were run on a • Solaris 2 CPU machine • 1 disk • 31MB of RAM for Informix
More disk • Spread input/output files across filesystems (create three or four large data areas just for this). • Fragment tables across multiple dbspaces. (up to 3 per CPU, but at least 100MB/dbspace) • Size tables properly
Unload Metrics • Simple unload 2.4 GB/CPU/Hr • Wait a minute • Update statistics • Simple unload 4.9 GB/CPU/Hr
Flexible • Load or Unload • Disk • Tape • Pipe
This means that you can… • Load straight from (or to) disk • Prepare data to a pipe and Load • Unload to gzip (pipe) • Load from gzip (pipe) • Unload to another HPL load process • Or Pload • Even on another machine
Unload to Gzip Load from Gzip Internal format to Gzip Internal from Gzip HPL “Insert into table2 select * from table1” (vs. normal insert/select) 6 GB/CPU/HR 3 GB/CPU/HR 11.5 GB/CPU/HR 7.5 GB/CPU/HR 4.5 GB/CPU/HR .7 GB/CPU/HR A quick look at some of those:
You can use HPL for all sorts of things it wasn’t intended for • As a backup tool • As a reorg tool • As a data movement/migration tool • As an ETL tool • For more thoughts on this check out Raj Murali’s talk on ETL within the database (XPS) D31 • also see the IBM Developer zone DSS Application Processing http://www7b.boulder.ibm.com/dmdd/zones/informix/library/techarticle/parker/0502parker.html
Flexible on the inside too. • A “Natural” interface • ipload to start • $DISPLAY • xhost • INFORMIXSERVER=tcp connection (for some)
Format Specification
Query Specification
Other • $DBONPLOAD • $PLCONFIG • Create your own functions and link them in • onpladm
Summary The High Performance Loader is a powerful tool which can be used wherever you need to move a lot of data around efficiently. With this tool you can replace a lot of old and slow code and realize significant performance improvements. Best of all? It’s already on your machine.