Middleware: High Technical Bandwidth, High Political Latency • Ken Klingenstein, Project Director, Internet2 Middleware Initiative; Chief Technologist, University of Colorado at Boulder
Topics • Acknowledgments • What is Middleware • Core middleware: the basic technologies • Identifiers • Authentication • Directories • PKI • The Gathering Clouds (aka Tightly-Knit Vapor) • Eduperson, the Directory of Directories, Shibboleth, HEPKI • What to do and where to watch?
Other middleware sessions • Mware 101 - big picture, identifier basics, authn, directory concepts, PKI overview • PKI 101 - apps, certs, profiles, policies, trust models • Early Adopters technology --- policy • Academic Medical Middleware • International Issues in Middleware • Labs: Eduperson, Shibboleth, DoD, apps • Mware 202 - identifiers+, directory deployments • Middleware 301 - metadirectories, registries, authorization • HEPKI-PAG - current policy activities • Middleware and the Grid • HEPKI-TAG - current technical activities • LDAP Recipe • Metadirectories BoF • Multicampus BoF
Acknowledgements • Mace and the working groups • Early Harvest - NSF catalytic grant and meeting • Early Adopters • Higher Ed partners - campuses, EDUCAUSE, CREN, AACRAO, NACUA, etc. • Corporate partners - IBM, ATT, SUN, et al. • Gov’t partners - including NSF and the fPKI TWG
Mace (Middleware Architecture Committee for Education) • Purpose - to provide advice, create experiments, foster standards, etc. on key technical issues for core middleware within higher ed • Membership - Bob Morgan (UW) Chair, Steven Carmody (Brown), Michael Gettes (Georgetown), Keith Hazelton (Wisconsin), Paul Hill (MIT), Jim Jokl (Virginia), Mark Poepping (CMU), David Wasley (California), Von Welch (Grid) • Creates working groups in major areas, including directories, interrealm authentication, PKI, medical issues, etc. • Works via conference calls, emails, occasional serendipitous in-person meetings...
Early Harvest • NSF funded workshop in Fall 99 and subsequent activities • Defined the territory and established a work plan • Best practices in identifiers, authentication, and directories (http://middleware.internet2.edu/best-practices.html) • http://middleware.internet2.edu/earlyharvest/
Early Adopters: The Campus Testbed Phase • A variety of roles and missions • Commitment to move implementation forward • Provided some training and facilitated support • Develop national models of deployment alternatives • Address policy standards • Profiles and plans are on I2 middleware site.
Early Adopter Participants • Dartmouth • U Hawaii • Johns Hopkins • Univ of Maryland, BC • Univ of Memphis • Univ of Michigan • Michigan Tech Univ • Univ of Pittsburgh • Univ of Southern Cal • Tufts Univ • Univ of Tennessee, Memphis
Remedial IT architecture • The proliferation of customizable applications requires a centralization of “customizations” • The increase in power and complexity of the network requires access to user profiles • Electronic personal security services are now an impediment to the next-generation computing grids • Inter-institutional applications require interoperable deployments of institutional directories and authentication
What is Middleware? • specialized networked services that are shared by applications and users • a set of core software components that permit scaling of applications and networks • tools that take the complexity out of application integration • sits above the network as the second layer of the IT infrastructure • a land where technology meets policy • the intersection of what networks designers and applications developers each do not want to do
Specifically, • Digital libraries need scalable, interoperable authentication and authorization. • The Grid as the new paradigm for a computational resource, with Globus the middleware, including security, location and allocation of resources, scheduling, etc. This relies on campus-based services and inter-institutional standards • Instructional Management Systems (IMS) need authentication and directories • Next-generation portals want common authentication and storage • Academic collaboration requires restricted sharing of materials between institutions • What I1 did with communication, I2 may do with collaboration
A Map of Middlewareland • Academic Computing Upperware • Research Oriented Upperware • Business Upperware • Core middleware • Network-layer middleware
The Grid • a model for a distributed computing environment, addressing diverse computational resources, distributed databases, network bandwidth, object brokering, security, etc. • Globus (www.globus.org) is the software that implements most of these components; Legion is another such software environment • Needs to integrate with campus infrastructure • Gridforum (www.gridforum.org) umbrella activity of agencies and academics • Look for grids to occur locally and nationally, in physics, earthquake engineering, etc.
Core Middleware • Identity - unique markers of who you (person, machine, service, group) are • Authentication - how you prove or establish that you are that identity • Directories - where an identity’s basic characteristics are kept • Authorization - what an identity is permitted to do • PKI - emerging tools for security services
Cuttings: Identifiers • “Any problem in Computer Science can be solved with another level of indirection” - Butler Lampson • “Except the problem of indirection complexity” - Bob Morgan
Major campus identifiers • UUID • Student and/or emplid • Person registry id • Account login id • Enterprise-lan id • Student ID card • Netid • Email address • Library/deptl id • Publicly visible id (and pseudossn) • Pseudonymous id
General Identifier Characteristics • Uniqueness (within a given context) • Dumb vs intelligent (i.e. whether subfields have meaning) • Readability (machine vs human vs device) • Affordance (centrally versus locally provided) • Resolver approach (how identifier is mapped to its associated object) • Metadata (both associated with the assignment and resolution of an identifier) • Persistence (permanence of relationship between identifier and specific object) • Granularity (degree to which an identifier denotes a collection or component) • Format (checkdigits) • Versions (can the defining characteristics of an identifier change over time) • Capacity (size limitations imposed on the domain or object range) • Extensibility (the capability to intelligently extend one identifier to be the basis for another identifier).
Important Characteristics • Semantics and syntax- what it names and how does it name it • Domain - who issues and over what space is identifier unique • Revocation - can the subject ever be given a different value for the identifier • Reassignment - can the identifier ever be given to another subject • Opacity - is the real world subject easily deduced from the identifier - privacy and use issues
Identifier Mapping Process • Map campus identifiers against a canonical set of functional needs • For each identifier, establish its key characteristics, including revocation, reassignment, privileges, and opacity • Shine a light on some of the shadowy underpinnings of middleware • A key first step towards the loftier middleware goals
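The mapping process above can be recorded as a small table of identifiers and their key characteristics. A minimal sketch, assuming hypothetical field names and made-up example values (none of them come from the slides or from any campus):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentifierProfile:
    """Key characteristics of one campus identifier (field names are illustrative)."""
    name: str
    domain: str          # who issues it and over what space it is unique
    revocable: bool      # can the subject ever be given a different value?
    reassignable: bool   # can the value ever be given to another subject?
    opaque: bool         # is the real-world subject hard to deduce from it?


# Example values are assumptions for illustration, not recommendations.
PROFILES = [
    IdentifierProfile("netid", "campus", revocable=True, reassignable=False, opaque=False),
    IdentifierProfile("email address", "campus", revocable=True, reassignable=True, opaque=False),
    IdentifierProfile("person registry id", "campus", revocable=False, reassignable=False, opaque=True),
]


def suitable_as_persistent_key(profile: IdentifierProfile) -> bool:
    """An identifier that is never revoked or reassigned can serve as a long-lived key."""
    return not profile.revocable and not profile.reassignable


persistent = [p.name for p in PROFILES if suitable_as_persistent_key(p)]
```

Filling in such a table for each identifier in use tends to surface the ones that are reassigned or easily deduced, which is where privacy and linking problems usually start.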
Authentication Options • Password based • Clear text • LDAP • Kerberos (Microsoft or K5 flavors) • Certificate based • Others - challenge-response, biometrics • Inter-realm is now the interesting frontier
Some authentication good practices • Precrack new passwords • Precrack using foreign dictionaries as well as US • Confirm new passwords are different from old • Require password change if possibly compromised • Use shared secrets or positive photo-id to reset forgotten passwords • US mail a one-time password (time-bomb) • In-person with a photo id (some require two) • For remote faculty or staff, an authorized departmental rep in person coupled with a faxed photo-id • Initial identification/authentication will emerge as a critical component of PKI
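The first three practices can be approximated at password-change time. A minimal sketch, assuming a plain dictionary lookup as a simplified stand-in for running a real cracking tool, and using tiny made-up wordlists; the function names are hypothetical:

```python
def normalize(word: str) -> str:
    """Lowercase and trim so lookups ignore case and stray whitespace."""
    return word.strip().lower()


def load_wordlist(lines):
    """Build a lookup set from an iterable of dictionary words."""
    return {normalize(w) for w in lines if w.strip()}


def precheck_password(new: str, old: str, wordlists: dict) -> list:
    """Return the reasons to reject a new password; an empty list means accepted."""
    problems = []
    if new == old:
        problems.append("same as old password")
    candidate = normalize(new)
    # Also try the candidate with trailing digits removed, a common trivial variation.
    stripped = candidate.rstrip("0123456789")
    for name, words in wordlists.items():
        if candidate in words or (stripped and stripped in words):
            problems.append(f"found in {name} dictionary")
    return problems


# Toy dictionaries; a real deployment would load full US and foreign wordlists.
WORDLISTS = {
    "english": load_wordlist(["password", "dragon"]),
    "german": load_wordlist(["passwort"]),
}

rejected = precheck_password("Passwort1", "old-secret", WORDLISTS)
```

Here `rejected` names the German dictionary, which is the point of checking foreign wordlists as well as US ones.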
Directory Issues • Applications • Overall architecture • Chaining and referrals, Redundancy and Load Balancing, Replication, synchronization, Directory discovery • The Schema and the DIT • attributes, ou’s, naming, objectclasses, groups • Attributes and indexing • Management • clients, delegation of access control, data feeds
Directory-enabled applications • Email • Account management • Web access controls • Portal support • Calendaring • Grids • QoS and maybe secure multicast
A Campus Directory Architecture • Border directory • Metadirectory • Enterprise directory • OS directories (MS, Novell, etc) • Departmental directories • Dir DB • Registries • Source systems
Key Architectural Issues • Interfaces and relationships with legacy systems • Performance in searching • Binding to the directory • Load balancing and backups are emerging but proprietary • Who can read or update what fields • How much to couple the enterprise directory with an operating system • http://www.georgetown.edu/giia/internet2/ldap-recipe/
Schema and DIT Good Practices • People, machines, services • Be very flat in people space • Keep accounts as attributes, not as an ou • Replication and group policies should not drive schema • RDN name choices rich and critical • Other keys to index • Creating and preserving unified name spaces
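Two of the practices above, a flat people space and accounts kept as attributes, can be illustrated with a small entry builder. A minimal sketch, assuming a hypothetical suffix `dc=example,dc=edu`, a hypothetical `campusAccount` attribute (which would need a locally defined schema), and simple uid values that need no DN escaping:

```python
BASE_DN = "dc=example,dc=edu"  # hypothetical directory suffix


def person_dn(uid: str) -> str:
    """Every person sits directly under a single flat ou=people container."""
    return f"uid={uid},ou=people,{BASE_DN}"


def person_entry(uid: str, cn: str, sn: str, accounts: list) -> dict:
    """Build a person entry as a plain dict of attribute name -> list of values."""
    return {
        "dn": person_dn(uid),
        "objectClass": ["top", "person", "organizationalPerson", "inetOrgPerson"],
        "uid": [uid],
        "cn": [cn],
        "sn": [sn],
        # Hypothetical attribute: accounts are values on the person entry,
        # not entries under a separate ou.
        "campusAccount": list(accounts),
    }


entry = person_entry("jdoe", "Jane Doe", "Doe", ["unix:jdoe", "mail:jane.doe"])
```

Because the DN carries no department or status, a person who changes role or unit keeps the same name in the directory.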
PKI • First thoughts • Fundamentals - Components and Contexts • The missing pieces - in the technology and in the community • Higher Ed Activities (CREN, HEPKI-TAG, HEPKI-PAG, Net@edu, PKI Labs)
PKI: A few observations • Think of it as wall jack connectivity, except it’s connectivity for individuals, not for machines, and there’s no wall or jack…But it is that ubiquitous and important • Does it need to be a single infrastructure? What are the costs of multiple solutions? Subnets and ITPs... • Options breed complexity; managing complexity is essential • PKI can do so much that right now, it does very little.
A few more... • IP connectivity was a field of dreams. We built it and then the applications came. Unfortunately, here the applications have arrived before the infrastructure, making its development much harder. • No one seems to be working on the solutions for the agora.
Uses for PKI and Certificates • authentication and pseudo-authentication • signing docs • encrypting docs and mail • non-repudiation • secure channels across a network • authorization and attributes • secure multicast • and more...
PKI Components • X.509 v3 certs - profiles and uses • Validation - Certificate Revocation Lists, OCSP, path construction • Cert management - generating certs, using keys, archiving and escrow, mobility, etc. • Directories - to store certs, and public keys and maybe private keys • Trust models and I/A • Cert-enabled apps
PKI Contexts for Usage • Intracampus • Within the Higher Ed community of interest • In the Broader World
X.509 certs • purpose - bind a public key to a subject • standard fields • extended fields • profiles to capture prototypes • client and server issues • v2 for those who started too early, v3 for current work, v4 being finalized to address some additional cert formats (attributes, etc.)
Standard fields in certs • cert serial number • the subject, as x.500 DN or … • the subject’s public key • the validity field • the issuer, as id and common name • signing algorithm • signature info for the cert, created with the issuer’s private key
Extension fields • Examples - auth/subject subcodes, key usage, LDAP URL, CRL distribution points, etc. • Key usage is very important - for digsig, non-rep, key or data encipherment, etc. • Certain extensions can be marked critical - if an app can’t understand it, then don’t use the cert • Requires profiles to document, and great care...
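The rule for critical extensions is simple to state in code. A minimal sketch, assuming certificates have already been parsed into a list of extension records; `keyUsage` and `cRLDistributionPoints` are standard X.509 extension names, while `campusSubcode` is a hypothetical local extension invented for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extension:
    name: str
    critical: bool


def usable_by(app_understands: set, extensions: list) -> bool:
    """A cert must not be used if any critical extension is unknown to the app.

    Non-critical extensions the app does not understand are simply ignored.
    """
    return all(ext.name in app_understands for ext in extensions if ext.critical)


CERT_EXTENSIONS = [
    Extension("keyUsage", critical=True),
    Extension("cRLDistributionPoints", critical=False),
    Extension("campusSubcode", critical=True),  # hypothetical local extension
]

old_app_ok = usable_by({"keyUsage"}, CERT_EXTENSIONS)
new_app_ok = usable_by({"keyUsage", "campusSubcode"}, CERT_EXTENSIONS)
```

An application that predates the local extension has to refuse the cert, which is why marking an extension critical calls for a documented profile and great care.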
Cert Management • Certificate Management Protocol - for the creation, revocation and management of certs • Revocation Options - CRL, OCSP • Storage - where (device, directory, private cache, etc.) and how - format (DER, BER, etc.) • Escrow and archive of keys - when, how, and what else needs to be kept • Cert Authority Software or outsource options • Homebrews • Open Source - OpenSSL, OpenCA, Oscar • Third party - Baltimore, Entrust, etc. • OS-integrated - W2K, Sun/Netscape, etc.
Directories • to store certs • to store CRL • to store private keys, for the time being • to store attributes • implement with border directories, or ACLs within the enterprise directory, or proprietary directories
What Isn’t Here Yet… • Scalable revocation • Standard certificate profiles • Certificate Policies and Practice Statements • Interrealm trust structures • Mobility
The Gathering Clouds (aka Tightly-Knit Vapor) • PKI - the research labs and HEPKI-TAG, PAG • Eduperson and the LDAP Recipe • the Directory of Directories • Shibboleth
Internet2 PKI Labs • At Dartmouth and Wisconsin in computer science departments and IT organizations • Doing the deep research - two to five years out • Policy languages, path construction, attribute certificates, etc. • National Advisory Board of leading academic and corporate PKI experts provides direction • Catalyzed by startup funding from ATT
HEPKI-TAG • chaired by Jim Jokl, Virginia • certificate profiles • survey of existing uses • development of standard presentation • identity cert standard recommendation • mobility options - SACRED scenarios • public domain software alternatives
HEPKI-PAG • David Wasley, prime mover • draft certificate policy for a campus • HEBCA certificate policy • FERPA • State Legislatures • Gartner Decision Maker software
eduPerson • a directory objectclass intended to support inter-institutional applications • fills gaps in traditional directory schema • for existing attributes, states good practices where known • specifies several new attributes and controlled vocabulary to use as values • provides suggestions on how to assign values, but it is up to the institution to choose • Version 1.0 almost done; one or two revisions anticipated
Issues about Upper Class Attributes • eduperson inherits attributes from person, inetorgperson • Some of those attributes need conventions about controlled vocabulary (e.g. telephones) • Some of those attributes need ambiguity resolved via a consistent interpretation (e.g. email address) • Some of the attributes need standards around indexing and search (e.g. compound surnames) • Many of those attributes need access control and privacy decisions (e.g. jpeg photo, email address, etc.)
New eduPerson Attributes • edupersonAffiliation • edupersonPrimaryAffiliation • edupersonOrgDN • edupersonOrgUnitDN • edupersonPrincipalName • edupersonNickname • edupersonSchoolCollegeName ***
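A controlled vocabulary only helps if values are checked when they are assigned. A minimal sketch, assuming an illustrative affiliation vocabulary (the eduPerson specification is the authority for the real list), made-up entry values, and an assumed local rule that the primary affiliation must be one of the listed affiliations:

```python
# Illustrative vocabulary only; consult the eduPerson specification for the real one.
AFFILIATION_VOCABULARY = {
    "faculty", "student", "staff", "alum", "member", "affiliate", "employee",
}


def validate_affiliations(entry: dict) -> list:
    """Return a list of problems with the affiliation attributes of one entry."""
    errors = []
    affiliations = entry.get("edupersonAffiliation", [])
    for value in affiliations:
        if value not in AFFILIATION_VOCABULARY:
            errors.append(f"unknown affiliation: {value}")
    primary = entry.get("edupersonPrimaryAffiliation")
    # Assumed local consistency rule, not taken from the slides.
    if primary is not None and primary not in affiliations:
        errors.append("primary affiliation is not among the listed affiliations")
    return errors


# Made-up example entry using the attribute names listed above.
ENTRY = {
    "edupersonAffiliation": ["staff", "student"],
    "edupersonPrimaryAffiliation": "staff",
    "edupersonPrincipalName": "jdoe@example.edu",
    "edupersonOrgDN": "o=Example University,c=US",
    "edupersonNickname": ["Jan"],
}

problems = validate_affiliations(ENTRY)
```

Running the same check at every institution is what lets an inter-institutional application trust that "staff" means the same thing everywhere.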
LDAP Recipe • how to build and operate a directory in higher ed • 1 Tsp. DIT planning, 1 Tbsp Schema design, 3 oz. configuration, 1000 lbs of data • good details, such as tradeoffs/recommendations on indexing, how and when to replicate, etc. • http://www.georgetown.edu/giia/internet2/ldap-recipe/
A Directory of Directories • an experiment to build a combined directory search service • to show the power of coordination • will highlight the inconsistencies between institutions • technical investigation of load and scaling issues, centralized and decentralized approaches • human interfaces issues - searching large name spaces with limits by substring, location, affiliation, etc... • Two different experimental regimes to be tested • centralized indexing and repository with referrals • large-scale parallel searches with heuristics to constrain search space • SUN donation of server and iPlanet license (6,000,000 dn’s) • Michael Gettes, Georgetown project manager
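The second experimental regime, parallel searches with limits to constrain the search space, can be sketched without any network at all. A minimal sketch, assuming in-memory lists as hypothetical stand-ins for remote campus directories; the site names, entries, and function names are invented, and a real implementation would issue LDAP searches instead:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote campus directories.
DIRECTORIES = {
    "campus-a": [
        {"cn": "Jane Doe", "affiliation": "staff"},
        {"cn": "John Smith", "affiliation": "student"},
    ],
    "campus-b": [
        {"cn": "Jan Doerr", "affiliation": "faculty"},
        {"cn": "Janet Doe", "affiliation": "staff"},
    ],
}


def search_one(site, entries, substring, affiliation=None, limit=10):
    """Substring search in one directory, optionally limited by affiliation."""
    needle = substring.lower()
    hits = [
        dict(entry, site=site)
        for entry in entries
        if needle in entry["cn"].lower()
        and (affiliation is None or entry["affiliation"] == affiliation)
    ]
    return hits[:limit]  # per-site cap keeps one large site from flooding results


def search_all(directories, substring, affiliation=None, limit=10):
    """Query every directory in parallel and merge the results in a stable order."""
    with ThreadPoolExecutor(max_workers=max(1, len(directories))) as pool:
        futures = [
            pool.submit(search_one, site, entries, substring, affiliation, limit)
            for site, entries in directories.items()
        ]
        results = [hit for future in futures for hit in future.result()]
    return sorted(results, key=lambda hit: (hit["cn"], hit["site"]))


all_hits = search_all(DIRECTORIES, "doe")
staff_hits = search_all(DIRECTORIES, "doe", affiliation="staff")
```

Even this toy version shows the interface problem the experiment expects: a short substring matches across institutions, so limits by affiliation or location matter.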
DoD Architecture • Inputs to DoDHE • Inputs: Local Site View • Central Deposit Service • DoDConfig Directory • Operation • Search Operations • Search Drill Down from a list
Inputs • Remote Site Directories • Remote Data Sources (LDAP, Oracle, Etc…) • Search • Data Filtering & Submit to CDS • DoD Config • Central Deposit Systems (CDS)
Inputs: Local Site View • Local Data Source (LDAP) • Generate LDIF Data • Filter LDIF according to local policy. Generate new LDIF for submission. • Submit final LDIF to CDS using authenticated POST via HTTPS. • CDS • DODHE
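The local filtering step can be as small as an attribute allowlist applied to each LDIF record. A minimal sketch, assuming simple LDIF (one `attr: value` per line, blank line between records, no folded lines or base64 values) and a hypothetical local policy; the authenticated HTTPS POST to the CDS is not shown:

```python
# Hypothetical local policy: attribute names (lowercased) that may leave the campus.
ALLOWED = {"dn", "cn", "sn", "mail", "edupersonaffiliation"}


def filter_ldif(ldif_text: str, allowed: set) -> str:
    """Return new LDIF containing only the allowed attributes of each record."""
    records = []
    for block in ldif_text.strip().split("\n\n"):
        kept = []
        for line in block.splitlines():
            attr, sep, _ = line.partition(":")
            if sep and attr.strip().lower() in allowed:
                kept.append(line)
        # Keep a record only if its dn line survived the filter.
        if kept and kept[0].lower().startswith("dn:"):
            records.append("\n".join(kept))
    return "\n\n".join(records) + "\n"


# Made-up source data for illustration.
SOURCE = (
    "dn: uid=jdoe,ou=people,dc=example,dc=edu\n"
    "cn: Jane Doe\n"
    "sn: Doe\n"
    "homePhone: 555-0100\n"
    "mail: jdoe@example.edu\n"
    "\n"
    "dn: uid=jsmith,ou=people,dc=example,dc=edu\n"
    "cn: John Smith\n"
    "sn: Smith\n"
    "userPassword: secret\n"
)

submission = filter_ldif(SOURCE, ALLOWED)
```

An allowlist fails closed: an attribute added to the source later stays on campus until local policy names it.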