610 likes | 739 Vues
This guide emphasizes the critical nature of learning SQL, especially in the healthcare sector where data is abundant and complex. It details the differences between SQL platforms and highlights key relational database concepts. Readers will learn how to apply SQL syntax through real-life examples, particularly using hierarchical databases and relational structures to manage healthcare data efficiently. By understanding SQL, healthcare professionals can harness vast data repositories for improved analytics and reporting, ensuring informed decision-making and secure data access.
E N D
Geoff Kimber SQL Reporting
Objectives • Realize the importance of learning SQL • Recognize SQL platform differences • Identify relational database concepts • Apply SQL syntax to a real-life example • Review SQL references
Hierarchical Databases • Data organized in tree structure with parent and child segments • Parent to child is a one to many relationship • Implies repeating information, generally in child segments • Fileman • Prescription file with refill multiple field
Relational Databases • Data consists of records stored in tables • Each row is unique • Column values are of the same data type • Sequence of rows and columns is insignificant
What is SQL? • Structured Query Language [SQL] • Is an ANSI computer language that is used to interact with many relational databases • Oracle • MS-SQL Server • MySQL, PostgreSQL and others • SQL is a set-based language • SQL is structured like English • SEQUEL (Simple English QUEry Language)
Why Learn SQL? • VA is truly data rich • Each site has between 2-4 TB of data† • non-imaging data • VA FileManager [FileMan] • Only within VistA • Limited functionality for non-programmers • Runs in the production environment • Workflow and resource concerns † - Pham R. The VA Data Lifecycle (Internals, Data Flows, and Business Intelligence)
Why Learn SQL? • Data repositories (warehouses) • Moves the analysis off-line • Provides accessible, yet very secure, data • ODBC, ADO and LINQ • Greater functionality and access control • The “official” data is secured within VistA • Independent of VistA • Microsoft Active Directory based access • Maintain the front-line stance of CPRS/VistA
Why Learn SQL? • Various VA relational databases • Corporate Data Warehouse (CDW) • Regional Data Warehouse (RDW) • VISN Data Warehouse (VDW) • Potential for local data warehouses • Extract raw data from VistA off-hours • UPSERT the data into an SQL server automatically • Users run their own reports and access is secure
Why Learn SQL? • Data warehouses = relational databases • SQL = the language of relational databases • Relational databases are everywhere • Learn a portable skill • Take a systems approach to collaboration • Build reusable code • Work more efficiently
Why SQL Server? • VA has an enterprise license • Microsoft SQL Server is functionally superior to Microsoft Access • Secure • Backups • Procedural code • No database size limit (software) • Efficiency
Microsoft Access & SQL Server • Two of the different SQL flavors • Transact SQL (Microsoft SQL Server) • Manage databases/services, procedural code • Jet SQL (Microsoft Access) • Information retrieval, program component • Each program has its own SQL flavor • Core keywords and functions follow standards • ANSI guidelines are “considered”
A Set-Based Language • In SQL: • Something is either part of a set or it is not • Order is meaningless • SQL Server supports procedural code • VBA can be used in MS Access for procedures
Corporate data warehouse structure • Dimension tables • Fact tables
Dimension tables • Contain attributes that describe records in fact tables • Each record contains unique identifier • Generally more stable • Drug name, provider name, patient name and address • Similar to Drug file (#50), New Person file (#200), Patient file (#2)
Fact tables • Capture operational data • Contain multiple unique identifiers • One identifier for each record • One identifier for each data element in dimension tables • Similar to Fileman IEN
Fact tables • Usually contain numerical and date fields • Prescription fills, Outpatient visits, lab test results • Similar to prescription file (#52), V POV (#9000010.07)
Dimension table examples • Dim.LocalDrug • LocalDrugSID • Sta3n (3 digit station number) • LocalDrugNameWithDose • VaClassification • PricePerDispenseUnit • NationalDrugSID
Dimension table examples • Dim.NationalDrug • NationalDrugSID • DrugNameWithoutDose • DosageForm • Strength
Dimension table examples • sPatient.sPatient • PatientSID • PatientName • PatientSSN • AddressLine1 • City • State • ZipCode
Fact table examples • RxOut.RxOutpat • RxOutpatSID • PatientSID • ProviderSID • LocalDrugSID • IssueDate • RxNumber
Fact table examples • Rxout.RxoutpatFill • RxoutpatFillSID • RxOutpatSID • FillType • ReleaseDateTime • Qty • DaysSupply • UnitPrice
CDW Tables • Metadata • Information about data • Includes CDW to VistA field crosswalk • http://vaww.dwh.cdw.portal.va.gov/metadata/default.aspx
Common SQL Clauses • Clauses perform set-based calculations
Common SQL Functions Microsoft Access supports the aggregate functions FIRST and LASTSQL Server has a COUNT(DISTINCT...) aggregate function
Common SQL Operators Access uses “*” as a wildcard and double quotes; SQL Server uses “%” and single quotesAccess uses the ampersand to concatenate “&”, SQL Server the plus sign “+”
Code example SelectLocalDrugSID ,NationalDrugSID ,LocalDrugNameWithDose fromvdw.dim.localdrug WhereLocalDrugNameWithDose like '%atenolol%'
Code example • Case is not normally important, but helps with readability • Spaces, tabs, and hard returns have limited importance • Important in text fields • Not important between commands • Help improve readability • Microsoft query analyzer color codes key words for readability
Code example • SQL databases have many different data types • Varchar • Int • Bigint • Smallint • Decimal • Float • Datetime • Money
Code example • Select • Main command used to retrieve data from SQL tables • Followed by list of fields to return • ‘*’ (without quotes) returns all fields in table • Normally, avoid using ‘*’ unless you need all the fields in a table
Code example • From vdw.dim.localdrug • Indicates source of data from which you are selecting • Vdw.dim.localdrug • VDW is database name • VDW => VISN Data Warehouse • RDW => Regional Data Warehouse • CDW => Corporate Data Warehouse • Dim is schema • Localdrug is table name
Code example • Where • Allows you to restrict or filter the table contents to just the results you want • Extremely versatile • Like • ‘%’ is a multicharacter wildcard • ‘%atenolol%’ - contains ‘atenolol’ • ‘atenolol%’ – begins with ‘atenolol’ • ‘%atenolol’ – ends with ‘atenolol’
Code example • Case is not normally important, but helps with readability • Spaces, tabs, and hard returns have limited importance • Important in text fields • Not important between commands • Help improve readability • Microsoft query analyzer color codes key words for readability
Code example SelectLocalDrugSID ,NationalDrugSID ,LocalDrugNameWithDose fromvdw.dim.localdrug Wherelocaldrugnamewithdose like '%atenolol%'
Fileman equivalent Select VA FileMan Option: PRINT File Entries OUTPUT FROM WHAT FILE: DRUG// SORT BY: GENERIC NAME// 'GENERIC NAME["ATENOLOL" WITHIN GENERIC NAME["ATENOLOL", SORT BY: FIRST PRINT FIELD: NUMBER THEN PRINT FIELD: PSNDF:NUMBER THEN PRINT FIELD: GENERIC NAME THEN PRINT FIELD:
Joins • Used to connect two or more tables together using data elements that are shared by individual tables • Similar to fileman jumps, with more power • Inner joins, outer joins and full joins
Inner Joins • Produces a result that includes only the records that are the same in both tables
Inner joins – Example Selecta.localdrugnamewithdose , b.strength fromvdw.dim.localdrug as a inner join vdw.dim.nationaldrug b ona.NationalDrugSID = b.NationalDrugSID whereLocalDrugNameWithDose like ‘%atenolol%’
Outer joins • Results contain all records in one table and only matching records in another table • Left joins – return all results in table specified in ‘from’ statement and all matching tables from ‘joined’ table • Right joins – return all results in table specified in ‘joined’ table ‘from’ statement and all matching tables from ‘from’ statement
Outer joins – example Selecta.localdrugnamewithdose , b.strength fromvdw.dim.localdrug as a left outer join vdw.dim.nationaldrug b ona.NationalDrugSID = b.NationalDrugSID whereLocalDrugNameWithDose like ‘%atenolol%’