GIS

1. GIS / Hydraulic Model Integration
2008 ESRI UC; Will Allender, GISP; Planning and Engineering, Asset Systems Planning; August 6, 2008
Thank you for the introduction and good afternoon everyone. My name is Will Allender and I am in the Asset Systems Planning group at Colorado Springs Utilities. This afternoon I would like to talk on the subject of GIS and hydraulic model integration with specific information about what we are doing at Colorado Springs Utilities. I would also like to start off by mentioning that the purpose of this presentation is actually two-fold. First, it is a practical, real-world application of how GIS and modeling are being integrated into a production, enterprise system in the utility industry, and I hope the information presented will benefit those of you in the industry who may be attempting a similar project. In addition, this presentation also serves as the conclusion to my academic pursuit of a Master's Degree in GIS with Penn State's online World Campus, so as I go along, I may be making some references strictly for the benefit of Penn State faculty and students who may be here today.

2. Agenda
Overview of Colorado Springs Utilities; Goals and objectives; Modeling types; Data requirements for modeling; System comparisons; Data and process flow; Lessons learned and next steps
For an agenda, first I would like to give you an overview of Colorado Springs Utilities and a brief history of our GIS and hydraulic modeling efforts. Then, I'd like to lay out the specific goals and objectives and the scope of the specific project that I worked on. I am going to make the assumption that most of you here are familiar with GIS; after all, this is a GIS conference and we're already more than halfway through the week. However, some of you may not be all that familiar with hydraulic modeling, so I want to spend a little bit of time talking about this domain and introducing some basic model types. Then, I'd like to get into the real heart of my project and talk about the data requirements of each system, how the GIS and the existing hydraulic model were compared to each other, and how that impacts the data and process flow as we move toward an integrated system. Finally, I will end with some lessons learned and next steps, and reflect a little bit on how my project went.

3. Overview and background
4 service utility - 500 square miles; 122,000 water meters - 380,000 population; 208,000 water mains - 2000 miles; 22,000 hydrants - 66,000 valves; Water - raw, potable and non-potable; AutoCAD-based model (H2ONet) -> InfoWater; ESRI/Librarian GIS -> SDE geodatabase
Colorado Springs Utilities is a 4-service utility providing electric, gas, water and wastewater service to Colorado Springs and the Pikes Peak region of Colorado. If we look at just our water system, we serve about 122,000 homes and businesses. That's about 380,000 people. Within our water infrastructure, we have about 208,000 individual water pipes, or mains, totaling about 2000 miles of pipe. We have roughly 22,000 fire hydrants and 66,000 valves, and we keep track of our water system from the snowmelt in high mountain reservoirs, where the water is called raw water, all the way down to our treatment facilities in Colorado Springs, where the water is treated and becomes potable drinking water, which is then distributed out to our customers. We also have a non-potable system, and this may include things like untreated groundwater which might be used for golf course irrigation, for example. Our hydraulic model and our GIS have been in production since the late '80s and early '90s. However, they unfortunately grew up independent of each other. Our hydraulic model began in an AutoCAD-based system called H2ONet, and this was finally migrated over to a personal geodatabase model called InfoWater in 2005. Our GIS originated in ESRI's Librarian coverages, and this was migrated over to an enterprise geodatabase that utilizes Oracle and SDE. This was also done in 2005. Now that both systems are stored in a geodatabase, the stage is set for integration of the two systems.

4. Goals and objectives
Overarching goal - Integrate GIS with InfoWater; Define target level of integration; Inter-system data element mapping; Inter-system data element comparison; Analyze data quality between systems; Establish data improvement processes; Reporting of quality improvements
The overall, long-term project goal for Colorado Springs Utilities is to perform an integration between GIS and the hydraulic model. However, being a combined work and school project that needed to fit within a college quarter, this project is really more of a kick-off, focused on developing a sound workplan and design, and performing the necessary data QA that will eventually lead to the two systems being integrated. With that in mind, the specific goals of this project include defining the target level of integration. Are we talking about interchange files or some sort of staged interface tables? Or are we talking about full systems integration, where modeling takes place inside of the enterprise GIS, and the enterprise GIS fully exposes all aspects of the model? For any integration effort, we'll need to establish the data mapping between the systems and then perform some qualitative comparisons on those cross-mapped attributes. As I said earlier, these 2 systems grew up independent of one another, so for them to become integrated, the data from one system will need to be of sufficient quality that it will be allowed to overwrite the same attributes in the other system, regardless of whether the integration is just an interface or a full integration. Currently, the data owners of each of these 2 independent systems think that their own data is the best source. So, we need to find all of the places where the two systems don't agree and perform the proper research to make data improvements. This data cleanup effort will also need a reporting mechanism so that we can monitor our data cleanup and predict a target date for actually overwriting one system with the other.
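The comparison and reporting steps described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: it assumes both systems share a common feature ID, and the field names in `FIELD_MAP` and the sample records are hypothetical, not taken from the Colorado Springs Utilities schemas.

```python
# Hypothetical cross-system mapping: GIS field name -> model field name.
FIELD_MAP = {"DIAMETER": "DIAM", "MATERIAL": "MAT"}


def compare_records(gis, model, field_map=FIELD_MAP):
    """List (feature_id, gis_field, gis_value, model_value) for every
    cross-mapped attribute that disagrees on a feature present in both systems.

    gis, model: dicts keyed by an assumed shared feature ID, each holding
    a dict of attribute values.
    """
    mismatches = []
    for fid in sorted(gis.keys() & model.keys()):
        for gis_field, model_field in field_map.items():
            gis_value = gis[fid].get(gis_field)
            model_value = model[fid].get(model_field)
            if gis_value != model_value:
                mismatches.append((fid, gis_field, gis_value, model_value))
    return mismatches


def summarize(gis, model, mismatches):
    """Counts that a cleanup report could track from one run to the next."""
    return {
        "gis_only": len(gis.keys() - model.keys()),
        "model_only": len(model.keys() - gis.keys()),
        "shared": len(gis.keys() & model.keys()),
        "shared_with_mismatch": len({fid for fid, *_ in mismatches}),
    }


# Made-up sample records, for illustration only.
gis_pipes = {
    "P1": {"DIAMETER": 12, "MATERIAL": "PVC"},
    "P2": {"DIAMETER": 8, "MATERIAL": "DI"},
    "P3": {"DIAMETER": 6, "MATERIAL": "PVC"},
}
model_pipes = {
    "P1": {"DIAM": 12, "MAT": "PVC"},
    "P2": {"DIAM": 6, "MAT": "DI"},
    "P9": {"DIAM": 16, "MAT": "DI"},
}

found = compare_records(gis_pipes, model_pipes)
report = summarize(gis_pipes, model_pipes, found)
```

Running the same summary on a schedule would give the kind of trend the reporting goal calls for, since the mismatch counts should fall as the cleanup proceeds.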

5. Goals and objectives
Why integrate at all? Eliminate manual data entry - gain efficiencies; Cross-system update; Data availability to the enterprise; Eliminate spreadsheet data transfer; Allows for complete model; Provide connection to customer consumption; Data validation - feedback loop
For those of us working these data integration issues, it always seems obvious why we should integrate. But I want to list a couple of specific reasons here just to further clarify the purpose of this project. You can read the full list on the screen, but for us, the most important reasons are to eliminate the redundant data entry efforts of the two independent groups, to eliminate the inefficiencies of spreadsheet-based data transfers, and to provide a data validation, or quality feedback, loop from the modelers back to the GIS team.

6. Modeling types
Skeletonized/reduced vs. all-pipes model: skeletonized is a simplified view of the system; improves model performance for large systems; all-pipes allowable due to PC/IT improvements
System model vs. daily model: system model includes existing and future; model is updated on-demand by the engineer; daily model is an operational model; real time updates; existing infrastructure
For those in the room not entirely up to speed on engineering capacity modeling, it's probably worthwhile to go over some basic modeling terminology. Skeletonized hydraulic models provide a simplified view of the overall water distribution system, and may, for example, only include pipes or features over a certain diameter, weeding out all of the smaller pipes that have less hydraulic impact. Or they may include only certain types of fittings, again, including only the ones that have the most hydraulic impact. Historically, this was done for performance reasons. However, with the vast improvements in computer speed over the past decades, it is now becoming much more realistic to go ahead and build what is called an "all-pipes model", one that includes every single pipe, node, tank, pump, reservoir and valve in the system, in all of its great and gory detail, and that's what we intend to do at CSU. InfoWater provides internal skeletonizing processes, so the engineers will have that option if their project dictates that need. Other model types include system models and daily models. System models contain as-built, existing infrastructure as well as future proposed features. For many, that's actually the main purpose of a model, to play what-if games with future plans. System models are a planning tool, and may only need to be updated a couple of times a year. A daily, or operational, model, on the other hand, keeps track of the dynamic state of the water distribution system and focuses on what is currently in place, in the ground. Initially, we plan to integrate to a system model. The integration requirements are much simpler and an upgrade path to a daily model will be available, because the daily model data is a subset of the system model.

7. Data requirements for modeling
GIS: Gravity main; Fitting; Hydrant; Line valve; Change of condition; Meter station; Control valve; Storage facility; Pump; Production well
So let's take a look at the data requirements of an InfoWater capacity model. As you may or may not know, any engineering capacity model is essentially a mathematical or logical model of a physical system, such as a water distribution system. It allows an engineer to input different scenarios and estimate a potential outcome. InfoWater is based on an underlying modeling engine called EPANet, written by the EPA. Its foundation is a link-node topology. So, by definition, features are input into a model as either a link or a node, and the engineering properties of each feature determine its hydraulic function or capacity. The available input features for InfoWater include Pipes (as links), Junctions (as nodes), and then a handful of node types where the engineering properties have significant influence on the hydraulic capacity. So, in our case at Colorado Springs Utilities, we went through our GIS feature classes and determined which feature classes should be brought over into the model, and we see those here on the screen. We have gravity mains mapped to pipes, a handful of point feature types mapped over to model junctions, and then some very specific mapping of feature classes over to the model in the form of valves, tanks, pumps and reservoirs. In essence, we have 10 feature classes in the GIS that can be mapped over to the corresponding feature types in InfoWater, used for modeling, and I'll go into the attribute detail for each of these in the following slides.
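One way to picture this feature-class mapping is as a simple lookup table, sketched here in Python. The pairings for gravity mains, fittings, hydrants, line valves, control valves and wells follow the slides. The pairings marked "assumed" are not spelled out in the talk and are guesses for illustration only. The names `GIS_TO_MODEL` and `classes_by_model_type` are hypothetical.

```python
from collections import defaultdict

# GIS feature class -> InfoWater element type.
GIS_TO_MODEL = {
    "Gravity main": "Pipe",
    "Fitting": "Junction",
    "Hydrant": "Junction",
    "Line valve": "Junction",
    "Change of condition": "Junction",  # assumed: not stated in the talk
    "Meter station": "Junction",        # assumed: not stated in the talk
    "Control valve": "Valve",
    "Storage facility": "Tank",         # assumed pairing with tanks
    "Pump": "Pump",
    "Production well": "Reservoir",
}


def classes_by_model_type(mapping=GIS_TO_MODEL):
    """Group the GIS feature classes under the model element type they feed."""
    grouped = defaultdict(list)
    for feature_class, model_type in mapping.items():
        grouped[model_type].append(feature_class)
    return dict(grouped)


grouped = classes_by_model_type()
```

Keeping the mapping in one table like this means the attribute-level mappings on the following slides can hang off the same keys.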

8. Data requirements for modeling
Pipes (links): Location (SHAPE field). Modeling properties: Length; Diameter; Roughness; Material type; Age (based on year installed); Type (hydrant lateral vs. not)
So for Pipes, we have geometry and locational information stored in the SHAPE field, by default, in any GIS. The engineering properties required for capacity modeling include length, diameter and a roughness coefficient, and that coefficient is often based on a combination of the pipe material type and age. Some modelers also base this on diameter. For any non-engineers here today, consider that a brand new 12" PVC pipe can transport a certain amount of water at a certain pressure in a certain amount of time. However, a 60-year-old 12" metal pipe that may be heavily corroded on its interior walls is actually much rougher on the inside, and this effectively decreases its hydraulic capacity. Another pipe characteristic that needs to be considered for modeling is whether or not the pipe is a hydrant lateral, as this helps determine how to apply demand, or water usage, which I'll talk more about on the next slide.
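To make the material-and-age idea concrete, here is a minimal Python sketch of deriving a roughness coefficient from GIS attributes. Everything numeric in it is a placeholder chosen for illustration, not engineering data, and the linear aging rule is an assumption rather than the method used at Colorado Springs Utilities. It follows the convention where a higher coefficient means a smoother pipe.

```python
# Placeholder values for illustration only; not engineering data.
NEW_PIPE_COEFFICIENT = {"PVC": 150.0, "METAL": 130.0}
ANNUAL_LOSS = {"PVC": 0.0, "METAL": 0.8}  # assumed loss of smoothness per year
MINIMUM_COEFFICIENT = 60.0                # assumed floor


def estimate_roughness(material, year_installed, current_year):
    """Estimate a roughness coefficient from material type and pipe age.

    Assumes a simple linear decline with age, bounded below by a floor.
    """
    age = max(0, current_year - year_installed)
    start = NEW_PIPE_COEFFICIENT[material]
    return max(MINIMUM_COEFFICIENT, start - ANNUAL_LOSS[material] * age)


new_pvc = estimate_roughness("PVC", 2008, 2008)
old_metal = estimate_roughness("METAL", 1948, 2008)
```

With these placeholder numbers the 60-year-old metal pipe ends up with a noticeably lower coefficient than the new PVC pipe, which is the direction of the effect described above.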

9. Data requirements for modeling
Junctions (fittings, hydrants, line valves): Location (x,y). Modeling properties: Demand (customer count and consumption); Elevation; Valve type or fitting type; Age (general information)
Nodes in a link-node topology are essentially just the connection points of the system network, and would be equivalent to pipe fittings, tees, line valves, couplings, etc. These are all features found in our GIS. Each node has a location in the form of x,y coordinates, and some useful engineering properties can be derived in GIS based on that location. For example, capacity modeling deals with customer demand or, in this case, water usage. In a GIS, we can link our customers' water usage from our Customer Billing System over to the actual customer location and then sum up a group of customers in a single area, for example. Based on this summary data from this group of customers, we can develop a picture of how much water is flowing through a particular point in the pipe network, because performing spatial joins on point data is a relatively easy task for a GIS. Likewise, it is fairly easy to determine the elevation of point features by simply overlaying the point data against an elevation grid, or digital terrain model, or by interpolating between contours. And finally, the type of fitting or valve may have an impact on the hydraulic model. A good example of this would be comparing the water flowing through a straight, inline connection such as a coupling, to the water flowing into the blunt end of a 3-way tee intersection. The tee is obviously much more likely to have a hydraulic impact, although I'm not an engineer myself, so I can't really tell you how large those impacts are. Another useful piece of information for the engineer may be the age of the fitting or valve, and this is easily pulled over from the GIS.
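The demand summary described above can be sketched in a few lines of Python. This is a simplified stand-in for a GIS spatial join: it assigns each customer to the nearest junction by straight-line distance and sums the consumption. The nearest-junction rule, the function name `allocate_demand` and the sample coordinates are assumptions for illustration. A production workflow would more likely use the GIS's own spatial join tools.

```python
import math
from collections import defaultdict


def allocate_demand(junctions, customers):
    """Sum customer consumption at the nearest model junction.

    junctions: {junction_id: (x, y)}
    customers: iterable of (x, y, consumption) from the billing data
    Returns {junction_id: total_consumption} for junctions that received demand.
    """
    demand = defaultdict(float)
    for cx, cy, consumption in customers:
        nearest = min(
            junctions,
            key=lambda jid: math.hypot(junctions[jid][0] - cx, junctions[jid][1] - cy),
        )
        demand[nearest] += consumption
    return dict(demand)


# Made-up coordinates and usage values, for illustration only.
junctions = {"J1": (0.0, 0.0), "J2": (10.0, 0.0)}
customers = [(1.0, 1.0, 5.0), (2.0, 0.0, 3.0), (9.0, 1.0, 4.0)]

demand_by_junction = allocate_demand(junctions, customers)
```

The same pattern extends to customer counts per junction by adding 1 instead of the consumption value.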

10. Data requirements for modeling
Valves (specifically pressure control valves): Location (x,y). Modeling properties: Elevation; Pressure settings; Age (general information)
Pressure control valves are also treated as nodes in the model, similar to standard nodes. However, pressure control valves don't just pass water from one side to another; they actually control or regulate the pressure. So, the pressure of the water coming into the valve is different from the pressure coming out of the other side, and this is by design. These engineering properties need to be captured as attributes. Unlike the previous slide, where I mentioned the empirical capacity of different types of fittings and valves, pressure control valves would have explicit values set for their design pressure.

11. Data requirements for modeling
Tanks: Location (x,y). Modeling properties: Diameter and volume; Base elevation; Min/max/initial water level; Age (general information)
Tanks are water storage features, so from a modeling perspective we obviously need to know the location, the size and the volume of the tank. However, when tanks are used in a gravity-fed system, such as in Colorado Springs, elevation plays a strong role in how the system functions hydraulically. So we need to store the elevation of the tank itself, but then we also need to provide some additional attributes for how the water level within the tank changes over time.

12. Data requirements for modeling
Pumps: Location (x,y). Modeling properties: Type; Elevation; Horsepower; Design head and design flow; Age (general information)
Pumps are also modeled as point features. Essentially, they pump water into a system, and that pumping ability is based on the type of pump, the elevation, the horsepower rating and the design flow.

13. Data requirements for modeling
Reservoirs (Wells at Colorado Springs Utilities): Location (x,y). Modeling properties: Type; Head; Pattern (depending on pump); Depth; Capacity
While many utilities throughout the country might use a reservoir as a direct input into a water distribution system, all of the reservoirs owned or maintained by Colorado Springs Utilities actually sit above the water treatment plants, so they do not provide a direct feed into the hydraulics of the model. The model hydraulics actually start at the water tanks. In Colorado Springs, we do have one well that feeds directly into a non-potable system, so attributes that we would be interested in would be things like head pressure, well depth, capacity, etc. I'm not going to dwell on this too much because, for the purpose of our modeling integration, we're really just focused on the potable system at this time. The non-potable system is very small and is actually modeled separately from the potable system.

14. System comparisons
Linear feature comparisons
Point feature comparisons

So, with all of that as an intro, we're now going to get into the real meat of the project that I worked on, which was the qualitative comparison of these two disparate systems. Again, these two systems evolved independently of one another: the model originated in a CAD-based engineering environment and the GIS originated in the IT department. The ultimate goal here is to be able to take one system, the system of record, and overwrite the other system any time it needs to be brought up to date. Specifically, we're talking about overwriting the model base data with information stored in the GIS. While both systems do contain high quality attribute data, only the GIS has high spatial accuracy, because the GIS group makes use of GPS in their data entry workflow, whereas the model data entry is usually just roughed in from preliminary design drawings or heads-up digitizing. So, in our comparison, we have to compare both the linear features and all of the point feature types.

In the image shown on the screen, I first laid down all GIS features and symbolized them in red. I then overlaid all of the model features, symbolized in blue. Where the blue features mask out the red ones, we can see that we have a pretty good correlation between the two systems. However, you can also see there are a lot more red features where blue ones don't exist. This could be because the engineer didn't feel that it was necessary to enter those features into the model, perhaps because they did not have hydraulic significance, kind of like the skeletonized model issue I presented earlier.

Actually, when we start looking at the quantitative differences between the two systems, looking at linear features alone, we see nearly a full order of magnitude difference between the two systems. Where we have about 208,000 water pipes in GIS, we only have about 24,000 pipes in the model. This, again, is due to both skeletonization and reduction.

15. Linear system comparison

So let's start with the linear system comparison first. Actually, quite a bit of work has been done in this space, including a fair amount of academic research. Unfortunately, none of what I came across was really applicable to the types of situations that I found in our utility data. There are also a lot of applications out there for automating the transfer of attributes between different linear systems. This process is called conflation. Unfortunately, the applications are geared primarily towards the highway industry, for example, comparing county street data to state street data. For fully automated conflation, the feature counts need to be roughly similar between the two systems, and if they're not, then a much more manual inspection of the data is needed. Also, the applications are quite expensive, so I haven't really been able to test whether or not these applications would work for Colorado Springs Utilities. So, I set about developing my own methods for data conflation between our pipe networks, using some standard tools available within ArcMap.

If we look at the two lines on this slide, I've got one line made up of 4 vertices, shown in black, that represents a pipe in the hydraulic model. Next to that, I have a similar 4-vertex line that represents the same feature but coming from the GIS. The goal here is to compare these two features and see if their attributes are the same. But how would you do that? Spatial joins work well for points and polygons, but they don't really make much sense when you're working with line segments. And the NEAR function in ArcMap requires that at least one of your inputs is a point feature. But we're trying to compare 2 line features here.

However, we can convert lines into points, where the point represents the centroid of the envelope that contains the line feature, and it carries along all of the attributes of the feature. This is shown on the slide as a crosshair in the middle of the image. But as you can see, the centroid point feature isn't really all that close to the line feature when you're dealing with a complex line that has an organic shape to it. This offset distance could become very problematic when applied to utility data, where we typically have very dense infrastructure and lines tend to crisscross each other all over the system.

16. Linear system comparison

In the case of our hydraulic model data, if we convert every pipe to a centroid, we would end up with roughly 24,000 points. But again, those points won't always be that close to the line in the GIS that we want to compare them to. Instead, we can take every line feature in the model and explode the line work into individual line segments, using the SPLIT LINE AT VERTICES tool. In the case of our data at Colorado Springs Utilities, this gave us about 120,000 line segments, each having a much more compact envelope. If we then convert each of these into centroids, using the FEATURE TO POINT tool, we end up with centroids that are much closer to sitting directly on top of the GIS feature that we are trying to compare to, shown in the center part of this slide.

So, to summarize, what we're really interested in for comparison purposes are the features shown at the bottom. The blue line is the water pipe from the GIS, and the black crosshairs are the centroids of the same pipe from the model. And the centroid point feature carries along the attributes from the model.
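The effect of exploding a pipe before taking centroids can be sketched in plain Python. This is only an illustration of the geometry, not the ArcMap SPLIT LINE AT VERTICES or FEATURE TO POINT tools themselves; the U-shaped pipe coordinates and all function names are hypothetical, and the offsets are measured against the pipe's own geometry as a stand-in for the matching GIS pipe.

```python
import math


def envelope_centroid(vertices):
    """Center of the bounding box (envelope) around a list of (x, y) vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return ((min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0)


def split_at_vertices(vertices):
    """Explode a polyline into its individual two-vertex segments."""
    return [[vertices[i], vertices[i + 1]] for i in range(len(vertices) - 1)]


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp to its end points.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def distance_to_line(p, vertices):
    """Shortest distance from point p to any segment of a polyline."""
    return min(point_segment_distance(p, a, b) for a, b in split_at_vertices(vertices))


# Hypothetical 4-vertex, U-shaped model pipe (units: feet).
model_pipe = [(0.0, 0.0), (0.0, 100.0), (100.0, 100.0), (100.0, 0.0)]

# One centroid for the whole pipe sits in the middle of the "U", away from the line.
whole_pipe_offset = distance_to_line(envelope_centroid(model_pipe), model_pipe)

# One centroid per exploded segment sits on the line itself.
segment_offsets = [
    distance_to_line(envelope_centroid(segment), model_pipe)
    for segment in split_at_vertices(model_pipe)
]
```

For this shape the whole-pipe centroid lies 50 feet from the pipe, while every per-segment centroid lies on it, which is the reason for exploding the line work first.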

17. Linear system comparison
Explode model pipes into line segments
Convert segments to centroids (w/ attributes)
Spatial join (limit 20')
Compare attributes

Restated, I exploded all pipes from the model and converted those line segments to a point feature class, carrying along the attributes, then performed a spatial join to the nearest pipe, limiting my search distance to 20 feet. The resulting feature class now has the attributes from both sets of pipe data. The beauty of doing this in ArcMap, using standard tools, is that I can throw the entire process into a ModelBuilder model, and re-run it at any time to check the progress of the data cleanup effort.
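The join step described above can be sketched as a nearest-pipe search with a distance cutoff. This is a brute-force illustration in plain Python, not the ArcMap spatial join or the ModelBuilder model used in the project; the record layout, IDs, and coordinates are hypothetical, and only the 20-foot limit comes from the talk.

```python
import math

SEARCH_DISTANCE_FT = 20.0  # search limit stated in the talk


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def distance_to_pipe(p, vertices):
    """Shortest distance from point p to a polyline given as a vertex list."""
    return min(
        point_segment_distance(p, vertices[i], vertices[i + 1])
        for i in range(len(vertices) - 1)
    )


def join_to_nearest_pipe(model_points, gis_pipes, limit=SEARCH_DISTANCE_FT):
    """Attach the nearest GIS pipe (within `limit`) to each model centroid."""
    joined = []
    for point in model_points:
        best_id, best_dist = None, None
        for pipe_id, pipe in gis_pipes.items():
            dist = distance_to_pipe(point["xy"], pipe["vertices"])
            if dist <= limit and (best_dist is None or dist < best_dist):
                best_id, best_dist = pipe_id, dist
        joined.append({
            "model_id": point["model_id"],
            "model_diam": point["diam"],
            "gis_id": best_id,  # None when no pipe is within the limit
            "gis_diam": gis_pipes[best_id]["diam"] if best_id is not None else None,
            "distance": best_dist,
        })
    return joined


# Hypothetical sample data (coordinates in feet, diameters in inches).
gis_pipes = {
    "G1": {"vertices": [(0.0, 0.0), (100.0, 0.0)], "diam": 8},
    "G2": {"vertices": [(0.0, 30.0), (100.0, 30.0)], "diam": 12},
}
model_points = [
    {"model_id": "M1", "xy": (50.0, 5.0), "diam": 8},
    {"model_id": "M2", "xy": (50.0, 500.0), "diam": 12},
]
joined = join_to_nearest_pipe(model_points, gis_pipes)
```

Here M1 joins to G1 at 5 feet, and M2 stays unmatched because no GIS pipe falls within the search distance. A production workflow would use a spatial index rather than comparing every point to every pipe.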

18. Linear system comparison

Here's a screenshot from ArcMap showing some of the results that came out of doing the linear feature comparison. The green lines represent the water pipes from the GIS. The red dots on top of that are the point features that came out of the spatial join. Once the spatial join was complete, I added a couple of attributes, shown out on the far right of the table, representing the differences found when comparing similar attributes. In this case, the only data I was able to compare was the date that the pipe was installed in the ground and the diameter of the pipe. These were the only two attributes explicitly stored in both systems. So the additional attributes are date_diff and diam_diff.

I highlighted a random record that shows where some of the differences between the GIS and the model may be significant. The value of calculating the diff fields is that we can also categorize the deltas and symbolize the range of differences per attribute. If you look at the symbology legend in the upper left, you can see that I categorized the layer using graduated symbols, based on the diameter difference. In the highlighted case, the diameter difference between the GIS and the model was 32 inches and the date difference was 10 years.

Once this data is calculated, we can sit down with the engineer and decide to either visit every case where there is a difference, or to prioritize the data cleanup effort based on the magnitude of the variance. This is the stage we are at now, meeting with the engineers to determine priority. This analysis resulted in about 5,500 pipes that did not agree on diameter between the model and the GIS. To put this in perspective, that's only 2-3% of the entire system. However, as I alluded to earlier, the data quality in the GIS is a barrier to us moving forward with the integration.
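The diff fields and the prioritization by magnitude can be sketched as follows. This is a minimal illustration, assuming joined records that already carry both systems' diameter and install year; the field names other than date_diff and diam_diff, the sample values, and the class labels are hypothetical. The 4-inch break is borrowed from the results summary later in the talk.

```python
def add_diff_fields(record):
    """Return a copy of a joined record with diam_diff and date_diff added."""
    out = dict(record)
    out["diam_diff"] = abs(record["gis_diam"] - record["model_diam"])
    out["date_diff"] = abs(record["gis_year"] - record["model_year"])
    return out


def diameter_class(diam_diff):
    """Bucket a diameter difference (inches) for graduated symbology."""
    if diam_diff == 0:
        return "match"
    if diam_diff <= 4:
        return "delta <= 4 in"
    return "delta > 4 in"


# Hypothetical joined records (diameters in inches, install dates as years).
joined = [
    {"id": "A", "gis_diam": 8, "model_diam": 8, "gis_year": 1985, "model_year": 1985},
    {"id": "B", "gis_diam": 6, "model_diam": 8, "gis_year": 1972, "model_year": 1970},
    {"id": "C", "gis_diam": 48, "model_diam": 16, "gis_year": 1990, "model_year": 1980},
]

compared = [add_diff_fields(r) for r in joined]
classes = {r["id"]: diameter_class(r["diam_diff"]) for r in compared}

# Largest diameter disagreements first, as one way to prioritize cleanup.
review_order = [
    r["id"]
    for r in sorted(compared, key=lambda r: r["diam_diff"], reverse=True)
    if r["diam_diff"] > 0
]
```

Sorting by the size of the delta is only one possible priority rule; the talk leaves that decision to the meetings with the engineers.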

19. Linear system comparison

So, here are a couple more examples of how the data did not agree between the two systems. I have labeled and symbolized the GIS data in red and the model in green. What's interesting here is that we see an example of the very early stages of our GIS, when we used to use the value 99 to represent unknowns. In this case, the GIS pipe has a diameter of 99 where the model is coded as 12 inch. The zeros, NULLs and 99s are an obvious data QA issue that needs to be dealt with, and we recognize this as an error in the GIS.
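Placeholder diameters of this kind can be separated from genuine disagreements before any comparison is run. The sketch below is a minimal illustration, assuming diameters are stored as numbers or None; the function and constant names are hypothetical, and treating every 99 as "unknown" assumes no real 99-inch pipe exists in the data.

```python
# Values described in the talk as stand-ins for "unknown" in the GIS.
UNKNOWN_DIAMETERS = {0, 99}


def is_unknown_diameter(value):
    """True when a GIS diameter is NULL (None), zero, or the legacy 99 code."""
    return value is None or value in UNKNOWN_DIAMETERS


# Hypothetical GIS diameters keyed by pipe ID (inches).
gis_diameters = {"P1": 12, "P2": 99, "P3": None, "P4": 0, "P5": 8}

needs_cleanup = sorted(
    pipe_id for pipe_id, diam in gis_diameters.items() if is_unknown_diameter(diam)
)
```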

20. Linear system comparison

Conversely, the model itself is also not without errors. Here we see that the modeler entered a typo and coded an 80 inch pipe where it should be 8 inch. So, both systems do in fact have errors present, which to me really validates what we're attempting to do: clean up both systems as we move towards integration.

21. Linear system comparison

Here is an example of a pump station, and this is a particular area where the comparison techniques really fall apart. We have SOME detail of the pipe system within the pump station; however, the modeler does some very unique data entry to represent pumps and their variable capacities. Given that we have fewer than 100 of these scenarios in our system, it is much more practical to simply visit each one and do the necessary data QA, instead of trying to develop an automated method.

22. Pipe roughness coefficient
Rule-based

So, the previous slides covered some of the values that were explicitly set in each system that we could query for errors. However, the pipe material, which goes into the roughness calculation, was not coded in both systems. Actually, if we look at how one of our consultants coded roughness, we see that he only used diameter and vintage to estimate roughness for each pipe. So, I used this information to derive a new value in GIS of what I would expect to find in the model for this attribute, and then compared that value to the model. This gave me some fairly interesting results.
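A rule-based estimate of this kind can be sketched as a lookup keyed on diameter and vintage. The consultant's actual chart is not reproduced in the talk, so every break point and roughness value below is a made-up placeholder, and the function names are hypothetical; only the idea of deriving an expected value from diameter and install year, then differencing it against the model, comes from the source.

```python
# HYPOTHETICAL placeholder rules: NOT the consultant's chart.
EXPECTED_ROUGHNESS = {
    ("old", "small"): 80,
    ("old", "large"): 90,
    ("mid", "small"): 110,
    ("mid", "large"): 120,
    ("new", "small"): 130,
    ("new", "large"): 140,
}


def vintage_class(install_year):
    """Placeholder vintage breaks (assumed, not from the source)."""
    if install_year < 1950:
        return "old"
    if install_year < 1980:
        return "mid"
    return "new"


def size_class(diameter_in):
    """Placeholder diameter break (assumed, not from the source)."""
    return "small" if diameter_in <= 12 else "large"


def expected_roughness(diameter_in, install_year):
    """Roughness the rule set would predict from GIS diameter and vintage."""
    return EXPECTED_ROUGHNESS[(vintage_class(install_year), size_class(diameter_in))]


def roughness_delta(model_roughness, diameter_in, install_year):
    """Model value minus rule-based expectation; nonzero suggests custom entry."""
    return model_roughness - expected_roughness(diameter_in, install_year)


# Hypothetical pipe: 8 inch, installed 1920, model roughness of 60.
delta = roughness_delta(60, 8, 1920)
```

A nonzero delta does not by itself mean the model is wrong; it may mark a value the engineer entered deliberately.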

23. Pipe roughness

I coded the symbology based on the difference in the value between GIS and the model, and categorized those differences based on standard deviation, although every categorization method produced a similar image to what we're seeing on the screen. In fact, we have a very high variance from the consultant's chart, specifically in the oldest part of Colorado Springs. And when I went in and looked at these values, about 2% had a roughness of less than 100, so they had been custom entered by the engineer at some time after the consultant's study, which was done in 1999.

24. Results of linear comparison
5944 locations where pipe diameters do not match (2.6%)
1440 locations: diameter delta > 4"
5101 locations where install dates do not match
2731 locations where date delta > 10 years
Hydraulic model material attribute does not support a useful comparison
Roughness coefficient of 3256 old pipes is custom
Only auto-update new pipes
1907 geometric network junctions
Linear system comparison

So, here is a summary of the issues found when comparing the linear features between the two systems. There were about 5,900 locations, which translated to about 5,500 pipes, where the diameters did not match. About a quarter of those had a variance greater than 4 inches. We have about 5,000 locations where the install date doesn't match, and about half of those vary by more than 10 years. We derived some roughness coefficients based on the previous consultant's method and found that a little over 3,000 pipes didn't match, and that these were almost exclusively found in the oldest parts of the city, where the engineer has done some custom modeling. So, we still need to find a way to preserve these values and not allow them to be overwritten by the GIS.

25. Junctions comparison
Hydrant: 12,546 in the model
98 as-built hydrants not co-located in GIS
Elevation
Tanks: 41 total
Already cleaned up to 100% match
Pumps, valves and reservoirs
No significant comparable attributes
Small number, visit manually
Node comparison results

So, I've spent a lot of time in this presentation discussing how the linear system was compared and the techniques that were used. In looking at our nodes, fortunately, points are much easier to analyze in GIS, using the NEAR function and from the simple standpoint that point features are much more likely to be co-located. Unfortunately, we didn't really have many attributes that could actually be compared, because they were only present in one system or the other. Rather than doing much in the way of data comparison, we simply need to know which attributes are required for modeling, and begin populating those in the GIS, and those attributes were discussed in previous slides.

A few results are presented here without much commentary about the methods used, which were very similar to what we did with the linear features. And really, I just want to highlight the most interesting aspects of analyzing the point features. We had about 12,500 hydrants present in the model, compared to the 22,000 found in the GIS. And about 100 of these were not found to be within 100 feet of a corresponding hydrant in the GIS. And some were even more than 1,000 feet away, and this is looking purely at the as-built hydrants, excluding the future plans. So we still need to take a look at why this is and whether it is a concern going forward. The most interesting analysis performed against the junction nodes had to do with their model elevation attribute as compared to the latest digital terrain model.
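The hydrant check described above amounts to a nearest-neighbor distance test. The sketch below is a brute-force illustration in plain Python rather than the ArcMap NEAR function; the IDs and coordinates are hypothetical, and only the 100-foot threshold comes from the talk.

```python
import math


def nearest_distance(xy, candidates):
    """Distance from one point to the closest point in `candidates`."""
    return min(math.hypot(xy[0] - c[0], xy[1] - c[1]) for c in candidates)


def unmatched_hydrants(model_hydrants, gis_hydrants, limit_ft=100.0):
    """IDs of model hydrants with no GIS hydrant within `limit_ft`."""
    return [
        hydrant_id
        for hydrant_id, xy in model_hydrants.items()
        if nearest_distance(xy, gis_hydrants) > limit_ft
    ]


# Hypothetical coordinates in feet.
model_hydrants = {"H1": (0.0, 0.0), "H2": (1500.0, 0.0)}
gis_hydrants = [(3.0, 4.0), (300.0, 0.0)]

not_colocated = unmatched_hydrants(model_hydrants, gis_hydrants)
```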

26. Elevation data comparison

The point feature elevation attributes were not populated in the GIS. But this is fairly simple to do and we should be able to auto-populate this attribute across all point feature classes in a matter of minutes. So, rather than do that and then run a comparison, I actually chose to take the junctions feature class from the model and overlay it with the latest digital terrain model, and calculate the expected elevation. This digital terrain model's accuracy is plus or minus 1 foot; however, as we can see from the image on the screen, there are some edge effects that may be present in the outer perimeter. I calculated the expected elevation using the Surface Spot tool in 3D Analyst, and weeded out all junctions that fell outside of the DTM.

About 1,300 (1,303) junctions in the model are off by more than 10 feet vertically.
About 400 (405) junctions in the model are off by more than 20 feet vertically.
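The elevation check can be sketched as sampling a terrain grid at each junction and bucketing the vertical difference. This is a stand-in for the Surface Spot tool, using bilinear interpolation on a tiny regular grid; the grid layout, elevations, junction IDs, and function names are all hypothetical. The 10-foot and 20-foot thresholds, and the step of dropping junctions outside the DTM, come from the talk.

```python
def sample_dtm(grid, origin_x, origin_y, cell_size, x, y):
    """Bilinear elevation at (x, y); None when the point is outside the grid.

    grid[row][col] holds elevations at grid nodes, with row 0 at origin_y
    and rows increasing with y. The grid must be at least 2 x 2.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    col = (x - origin_x) / cell_size
    row = (y - origin_y) / cell_size
    if col < 0 or row < 0 or col > n_cols - 1 or row > n_rows - 1:
        return None
    c0 = min(int(col), n_cols - 2)
    r0 = min(int(row), n_rows - 2)
    fc, fr = col - c0, row - r0
    low = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    high = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return low * (1 - fr) + high * fr


def elevation_flag(model_elev, dtm_elev):
    """Bucket the vertical disagreement using the thresholds from the talk."""
    diff = abs(model_elev - dtm_elev)
    if diff > 20:
        return "off by > 20 ft"
    if diff > 10:
        return "off by > 10 ft"
    return "ok"


def check_junctions(junctions, grid, origin_x, origin_y, cell_size):
    """Flag each junction; junctions outside the DTM are left out."""
    flags = {}
    for junction_id, (x, y, model_elev) in junctions.items():
        dtm_elev = sample_dtm(grid, origin_x, origin_y, cell_size, x, y)
        if dtm_elev is None:
            continue
        flags[junction_id] = elevation_flag(model_elev, dtm_elev)
    return flags


# Hypothetical 2 x 2 DTM with 100 ft cells, and junctions as (x, y, model elevation).
dtm = [[6000.0, 6010.0], [6020.0, 6030.0]]
junctions = {
    "J1": (0.0, 0.0, 6001.0),
    "J2": (50.0, 50.0, 6030.0),
    "J3": (100.0, 100.0, 6055.0),
    "J4": (500.0, 500.0, 6000.0),
}
flags = check_junctions(junctions, dtm, 0.0, 0.0, 100.0)
```

In this sample, J4 falls outside the grid and is dropped rather than flagged, mirroring how junctions outside the DTM were weeded out.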

27. Issues and next steps
Date fields: InfoWater stores dates as YYYY, GIS stores dates as MM/DD/YYYY
All coded domains must be decoded for interpretation
LID or LinkID: to be determined and quite problematic
Diameters: search and replace all NULLs and 0's with valid values
Feature elevations: derive in GIS
Pump flow rates are entered in comment field (inconsistent)
Preserve custom model attribution
Demand allocation is a full project
Data QA is an "exploratory process"
Global system issues
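The date field mismatch listed above can be handled by reducing both values to a year before comparing. This is a minimal sketch, assuming the GIS value arrives as an MM/DD/YYYY string and the InfoWater value as a four-digit year string; the function names and sample values are hypothetical.

```python
from datetime import datetime


def year_from_gis(date_text):
    """Install year from a GIS date string in MM/DD/YYYY form."""
    return datetime.strptime(date_text, "%m/%d/%Y").year


def year_from_model(year_text):
    """Install year from a model value stored as YYYY."""
    return int(year_text)


def install_year_diff(gis_date, model_year):
    """Absolute difference in install year between the two systems."""
    return abs(year_from_gis(gis_date) - year_from_model(model_year))


# Hypothetical values for one pipe.
diff_years = install_year_diff("08/06/2008", "1998")
```

Comparing at year resolution discards the month and day held in the GIS, which is unavoidable while the model stores only the year.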

28. Questions?
Will Allender
wallender@csu.org
Special thanks to Penn State
Dr. Patrick M. Reed, Technical Advisor
Dr. Doug Miller, Academic Advisor

I would like to thank my advisors for their guidance and involvement during this project, and open it up for questions. Please keep in mind, I'm not a hydraulic engineer, I'm a GIS professional. So, if you have some specific hydraulic modeling questions, I may ask you to direct those to our MWH InfoWater people in the room.
