Interaction models between Query Clients, Information Resources & Discovery Services
Mark Harrison, mark.harrison@cantab.net
Assumptions
Connectivity, Availability
• EPCIS instances are generally connected to the Internet and generally reliable
• EPCIS instances may have downtime (e.g. for maintenance)
• The volume of queries to an EPCIS is approximately 10% of the volume of capture events (an EPCIS handles more capture events than query requests)
• The address of an EPCIS may change infrequently
Trust and Confidentiality
• The provider of a Discovery Service is expected to be trustworthy and to act in the interest of the resources (e.g. EPCIS instances)
Requirements
• Client queries must be treated confidentially by a DS
• DS records (typically EPC, resource URL) must be treated confidentially by a DS
• Latency and response times should be minimized
• The query response must be complete - i.e. it must contain all answers from resources that have willingly chosen to provide an answer
Queries & Data - Assumptions
• Clients can specify either:
  • a full query (including EPC and other parameters) OR
  • only the EPC identifier
• The EPC number represents the query key
• Resources (e.g. EPCIS) can publish to a DS:
  • Full EPCIS events (including business step, etc.) OR
  • EPC identifier only
Queries & Data - Assumptions
Discovery Services might hold / handle one of:
• (EPC, resource reference e.g. EPCIS URL)
• Fully replicated EPCIS events
• EPCs of interest to various clients (clients' interests)
• Full client queries (e.g. EPCIS queries)
Interaction modes
• One-off queries
  • Assist with gathering of historical data (e.g. trace) up to time of query (DS provides referrals or forwards query)
  • Synchronous response may be possible
• Standing queries
  • Be notified of future updates from new resources (e.g. companies who handle the object at future times)
  • Only asynchronous notification possible
Interaction mode & Transient Connectivity
• Quick response times are a key requirement for a DS, according to questionnaire / interview responses (quick ~ up to 5 seconds / synchronous response)
• Predictable response time is required
• EPCIS resources are likely to have permanent connectivity to the network
Data Ownership & Trust
• Data ownership is a key concern.
• Users may be reluctant to share more than the minimum necessary data with a DS - or sharing of additional data should be optional.
• Reject models that require the resource owner (e.g. company having an EPCIS) to share detailed information with a DS without first learning which clients require the detailed access and being able to refuse or negotiate this access.
Threats & concerns relating to Discovery Services
• Revealing sensitive information (volumes, flows of goods) to unauthorized parties
  • e.g. where resources lose control over which clients see links
  • e.g. 'harvesting' of info from client queries by 'honeypot' resources
• Excessive network traffic / unnecessary messages
  • New vulnerabilities / mechanism for Denial of Service attacks?
• Slow response times
  • Inability to provide synchronous response
  • Waiting for response from underlying / proxy query - and maintaining session state
• Manageability / complexity of specifying access control policies
  • versus making a separate assessment for each query / each new client
  • reuse / enforcement at both EPCIS and DS layers of architecture
  • need for consistency (synchronization?) between a resource's policies at EPCIS & DS layers
• A Discovery Service may need to restrict which clients and which resources can use the DS (to limit DoS, honeypot attacks)
Clients, Intermediary & Resources
• Query Client: wishes to retrieve information (e.g. event data) from one or more organizations
• Intermediary (e.g. DS): maintains an internal list of associations and can help clients find resources (or vice versa)
• Resource (e.g. EPCIS): holds information about individual objects. Could be an EPCIS - but also web pages, web services, XML data and other service types
A note about the diagrams that follow...
[Diagram: familiar directory-like diagram showing two Query Clients, one Intermediary (e.g. Discovery Service) and five Resources (e.g. EPCIS)]
A note about the diagrams that follow...
[Interaction diagram: Query Client (any client), Intermediary (e.g. Discovery Service), Resource (e.g. EPCIS; any EPCIS, or even another DS or other kinds of services)]
N.B.1. Even though the following diagrams only show one client and one resource, this is for clarity, to show the sequence of interactions - it does not imply that only one client or one resource can connect to each DS
N.B.2. When a resource is shown sending a message to an intermediary DS, this does not require additional EPCIS functionality. The resource may consist of an EPCIS repository and a separate 'DS publishing application' that publishes selected events (or fields derived from EPCIS events) to a Discovery Service.
Different message flow sequences
• We'll look at different message flow sequences between Client, Resource & Intermediary (Discovery Service)
• ... and analyze their merits in terms of:
  • impact on performance,
  • response time to queries,
  • confidentiality of resource's info
  • confidentiality of client's query
• Consider three phases of interaction...
• Setup: Client & Resource interact with DS to register interests & capabilities and negotiate security rights
• Discovery: Provides either client or resource with sufficient info to initiate service fulfillment
• Service Fulfillment: Resource becomes aware of client request and is able to meet it
With Discovery phase
[Grid of models; column headings: Synchronous Request/Response, Asynchronous Publish & Subscribe]
• Client is querying, Resource is publishing (Client may be unknown): Directory of Resources, Notification of Resources
• Client is publishing, Resource is querying (Resource may be unknown): Directory of Clients, Notification of Clients
Directory of Resources
[Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Discovery, Fulfillment. Message labels: EPC; EPC, URL, serviceType; EPC, URL; URL; full EPCIS query; EPCIS result-set]
Directory of Clients
[Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Discovery, Fulfillment. Message labels: EPC, URL; EPC, Client ID; ? EPCs; EPC, Client ID; EPC, Client ID; full EPCIS query; EPCIS result-set]
Notification of Resources
[Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Discovery, Fulfillment. Message labels: EPC, Client ID; EPC, Client ID; EPC, URL; EPC, URL; full EPCIS query; EPCIS result-set]
Notification of Clients
[Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Discovery, Fulfillment. Message labels: EPC, URL; EPC, URL, serviceType; EPC; EPC, Client ID; EPC, Client ID; full EPCIS query; EPCIS result-set]
Without a distinct Discovery phase
[Grid of models; column headings: Synchronous Request/Response, Asynchronous Publish & Subscribe]
• Resource to Client: Meta Resource, Notification of Events
• Client to Resource: Meta Client, Query Propagation
Meta Resource
[Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Fulfillment. Message labels: full EPCIS query; EPCIS events; EPCIS events; EPCIS result-set]
Meta Client
[Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Discovery, Fulfillment. Message labels: ClientID, EPCIS queries; full EPCIS query, ClientID; any queries?; queries, ClientID; EPCIS result-set]
Notification of Events
[Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Fulfillment. Message labels: full EPCIS query, ClientID; ClientID, EPCIS queries; full EPCIS events; full EPCIS events]
Query Propagation
[Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Discovery, Fulfillment. Message labels: EPC, URL, serviceType; EPC, URL; full EPCIS query, Client ID; full EPCIS query, Client ID; EPCIS result-set]
[Summary diagram: four interaction diagrams, each between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) with phases Setup, Discovery, Fulfillment, arranged under the headings One-off queries and Standing queries: Query Propagation and Meta Client; Directory of Resources and Notification of Resources. Message labels: EPC, URL, serviceType; EPC, Client ID; EPC, URL, serviceType; EPCIS query, Client ID]
Directory Service
[Interaction diagrams between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Discovery, Fulfillment: Directory of Resources (one-off queries) and Notification of Resources (standing queries). Message labels: EPC, URL, serviceType; EPC, Client ID]
• 'immediate' response - resources do not need to be available at time of DS query
• DS does not need to maintain session information for each DS query
• client can choose which links to follow and can adjust its query for each resource
• DS needs to store and enforce access control policies on behalf of resources in addition to the access control mechanism that each resource provides
• resources need to trust the DS operator to enforce their policies on their behalf
• resource does not find out client ID until client chooses to make an EPCIS query
Directory Service
[Diagram: the Query Client asks the DS "Who has info about EPC 123?". DS table entries as extracted: 456, EPCIS 1; 123, EPCIS 2; 123, EPCIS 3. Resources shown: EPCIS 1, EPCIS 2, EPCIS 3]
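The Directory Service behaviour described in the bullets can be sketched roughly as follows. This is a minimal illustration, not the BRIDGE design: the class names, the `allowed_clients` policy field and the example URLs are all assumptions made for the sketch. It shows the DS answering from its own records (no resource needs to be online) and enforcing each resource's policy before releasing a link.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DSRecord:
    """One (EPC, URL, serviceType) link published by a resource (hypothetical layout)."""
    epc: str
    url: str
    service_type: str
    allowed_clients: frozenset  # policy set by the resource owner, enforced by the DS


class DirectoryService:
    def __init__(self):
        self._records = {}  # EPC -> list of DSRecord

    def publish(self, record):
        self._records.setdefault(record.epc, []).append(record)

    def lookup(self, epc, client_id):
        # Answer immediately from stored records; release only the links
        # whose policy admits this client.
        return [
            (r.url, r.service_type)
            for r in self._records.get(epc, [])
            if client_id in r.allowed_clients
        ]


ds = DirectoryService()
ds.publish(DSRecord("123", "https://epcis2.example/query", "EPCIS",
                    frozenset({"client-a"})))
ds.publish(DSRecord("123", "https://epcis3.example/query", "EPCIS",
                    frozenset({"client-a", "client-b"})))
links_for_b = ds.lookup("123", "client-b")  # only the EPCIS 3 link is released
```

The client then decides which of the returned links to follow, so the resource learns the client ID only when the client actually sends it an EPCIS query.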
Query Relay
[Interaction diagrams between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), with phases Setup, Discovery, Fulfillment: Query Propagation (one-off queries) and Meta Client (standing queries). Message labels: EPC, URL, serviceType; EPCIS query, Client ID]
• DS does not return any resource data to clients
• DS propagates client query to resources and client must wait for them to respond
• The same client query will be sent to all resources - the client has neither visibility of nor control over which publishers will receive its query - but has the convenience of not having to make iterative follow-up EPCIS queries
• DS does not store fine-grained access control policies on behalf of resources - access control is done by each resource independently, with knowledge of client ID
• Each resource can log all (successful/failed) attempts to access its data
• Resources can deny access to certain clients without making client aware of denial
Query Relay
[Diagram: the Query Client asks the DS "Who has info about EPC 123?". DS table entries as extracted: 456, EPCIS 1; 123, EPCIS 2; 123, EPCIS 3. Resources shown: EPCIS 1, EPCIS 2, EPCIS 3]
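For comparison, a rough sketch of the Query Relay behaviour: the DS holds only routing information and forwards the query with the client ID, and each resource makes its own access decision and keeps its own log. All names here are hypothetical, and the relay is shown as a simple in-process, synchronous loop with replies consolidated by the relay, which is only one of the routing options discussed in these slides.

```python
class Resource:
    """Stand-in for an EPCIS that enforces its own access control."""

    def __init__(self, name, events, authorized_clients):
        self.name = name
        self._events = events              # EPC -> list of event descriptions
        self._authorized = authorized_clients
        self.access_log = []               # (client_id, epc, allowed)

    def handle_query(self, epc, client_id):
        allowed = client_id in self._authorized
        self.access_log.append((client_id, epc, allowed))
        if not allowed:
            return None                    # silent denial: client is not told
        return self._events.get(epc, [])


class QueryRelay:
    def __init__(self):
        self._routes = {}                  # EPC -> resources that registered it

    def register(self, epc, resource):
        self._routes.setdefault(epc, []).append(resource)

    def relay(self, epc, client_id):
        # Every registered resource receives the same query and client ID;
        # the relay itself stores no fine-grained policy.
        replies = {}
        for resource in self._routes.get(epc, []):
            answer = resource.handle_query(epc, client_id)
            if answer is not None:
                replies[resource.name] = answer
        return replies


epcis2 = Resource("EPCIS 2", {"123": ["shipped"]}, {"client-a"})
epcis3 = Resource("EPCIS 3", {"123": ["received"]}, {"client-b"})
relay = QueryRelay()
relay.register("123", epcis2)
relay.register("123", epcis3)
replies = relay.relay("123", "client-a")
```

Here `client-a` receives data only from EPCIS 2, cannot tell that EPCIS 3 declined, and EPCIS 3 still has a log entry for the failed attempt.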
Routing asynchronous replies back to clients
• Some models require a DS or a resource (e.g. EPCIS) to respond asynchronously to a client.
  • Client might specify a return address of a listener or client proxy that is reachable (e.g. in DMZ, not behind firewall)
  • In some models, the response does not come from the DS but from an unexpected / unknown resource
  • Will need mutual authentication + establishment of mutual trust
• Routing resource responses back via a DS may be an option
  • allows consolidation of responses from resources
  • allows decoupling of client / resource address info
  • DS may / may not maintain state / session info (# expected replies)
  • burden of maintaining client session information adds to complexity and to scalability problems
Clients receiving replies - problems of slow / withheld responses
• Clients may have to receive and combine/correlate independent responses from multiple resources
• Problems:
  • Will client know how many replies to expect for each DS query?
  • Were all resources willing to reply to the client?
  • Will some resources be slow to respond?
  • How should a resource respond (without revealing its ID) when it has information and may be willing to co-operate but still needs further client credentials / justification / negotiation with client?
• Possible options:
  • DS might return to client # of resources the DS forwarded query to. Risky?
  • Use timeout intervals for receipt of responses. But may miss slow replies.
  • Resource might send an opaque token to client via a DS, with DS acting as a go-between to help facilitate initial negotiations between client and resource (by passing messages)
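The timeout option, and its drawback, can be illustrated with a small sketch. The function name, the resource names and the use of threads to stand in for network calls are all assumptions for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def collect_replies(resource_calls, timeout_s):
    """Gather replies from several resources, giving up after timeout_s seconds.

    resource_calls maps a resource name to a zero-argument callable that
    returns that resource's reply (a stand-in for a network request).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(resource_calls)))
    futures = {pool.submit(call): name for name, call in resource_calls.items()}
    done, _not_done = wait(futures, timeout=timeout_s)
    pool.shutdown(wait=False)  # do not block on slow resources
    replies = {futures[f]: f.result() for f in done if f.exception() is None}
    # The client can only tell who is missing if it knows who was asked.
    unanswered = sorted(set(resource_calls) - set(replies))
    return replies, unanswered


replies, unanswered = collect_replies(
    {
        "epcis-1": lambda: ["event-a"],
        "epcis-2": lambda: time.sleep(1.0) or ["event-b"],  # slow resource
    },
    timeout_s=0.25,
)
```

With this timeout the reply from `epcis-2` is missed even though it would have arrived, which is the "may miss slow replies" trade-off noted above. The `unanswered` list is only available because the caller knows which resources were contacted; in a pure relay model the client would not.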
Analysis of design options - Security & Trust
Confidentiality of Client queries
• In Query Relay model, risk of 'harvesting' by rogue resources ('honeypots').
  • Clients may need to check plausibility of records asserted by resources (e.g. an object cannot be within physical custody of two organizations at the same time)
  • Might need 'business step' to understand whether physical custody is being claimed (to allow for non-custodial resource owners, e.g. insurers)
• Blacklists and whitelists
  • Sent by client with client's query, possibly cached in DS
  • Use of blacklists to prevent relaying of queries to known competitors or dubious resources
  • Use of whitelists to restrict forwarding to only a set of trusted business partners.
    • May prevent client from discovering unknown yet trustworthy resources that hold relevant information
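A minimal sketch of how a relay might apply client-supplied blacklists and whitelists before forwarding a query. The function name and resource identifiers are hypothetical.

```python
def select_relay_targets(candidates, whitelist=None, blacklist=frozenset()):
    """Return the resources a query may be forwarded to.

    candidates: resource IDs the relay would otherwise forward to.
    whitelist:  if given, forward only to these resources.
    blacklist:  never forward to these resources.
    """
    targets = [r for r in candidates if r not in blacklist]
    if whitelist is not None:
        targets = [r for r in targets if r in whitelist]
    return targets


candidates = ["partner-1", "partner-2", "competitor-x", "unknown-9"]
with_blacklist = select_relay_targets(candidates, blacklist={"competitor-x"})
with_whitelist = select_relay_targets(candidates, whitelist={"partner-1", "partner-2"})
```

The whitelist result drops `unknown-9` as well as the competitor, which illustrates the last bullet: a whitelist may hide an unknown yet trustworthy resource from the client.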
Analysis of design options - Security & Trust
Confidentiality of Resource information
• In Directory Service model, release of records (links) to client should be controlled via access control policies specified by the resource owner and enforced by the DS operator
• DS operator may also (need to) specify and enforce overall policies for their DS (e.g. who can query, who can publish, regulators' policies) that over-ride individual policies specified by resource owners.
  • Resource owners should be aware of these DS policies before publishing to it
• In Directory Service model, scalability & management of security policies is a major concern
• If resource policy hides resource ID from unknown clients, how could those clients begin to negotiate with the resource?
  • Possible mediation role for DS using temporary token + relaying of messages?
• In Query Relay model, resource can decide whether to allow access
  • Without delegation to a DS
  • Also taking into account real-time info (e.g. current load on resource)
• Query Relay model may work better for unsolicited client communications
Overcoming 'deadlock' before trust is established
[Diagram: the Resource (e.g. EPCIS) tells the Discovery Service "I hold information about EPC xyz - but hide my real ID & contact info from unauthorized / unknown clients". The Query Client receives an opaque token (0A8274B2845EF) and sends a request for access, quoting the token. The Discovery Service passes on to the Resource a request for access from client with ID...]
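One way to picture the token mediation is the sketch below. It is an assumption-laden illustration: the class and method names are invented, the token is simply a random hex string, and message relaying is reduced to an in-memory inbox per resource.

```python
import secrets


class TokenMediator:
    """DS-side go-between that hides a resource's identity behind an opaque token."""

    def __init__(self):
        self._token_to_resource = {}   # known only to the DS
        self._inboxes = {}             # resource ID -> relayed access requests

    def issue_token(self, resource_id):
        token = secrets.token_hex(8)   # random, carries no resource information
        self._token_to_resource[token] = resource_id
        return token

    def request_access(self, token, client_id, message):
        resource_id = self._token_to_resource.get(token)
        if resource_id is None:
            return False               # unknown or expired token
        self._inboxes.setdefault(resource_id, []).append((client_id, message))
        return True

    def inbox(self, resource_id):
        return list(self._inboxes.get(resource_id, []))


mediator = TokenMediator()
token = mediator.issue_token("epcis-hidden")
accepted = mediator.request_access(token, "client-a", "request access to EPC xyz")
```

The client only ever sees the token, while the resource sees the client ID and can decide whether to reveal itself or continue negotiating through the DS.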
Analysis of design options - Security & Trust
Information Integrity
• Must prevent compromising the integrity of information held within DS
• Deletion / change of information only in accordance with security policies
• Resource should retain right to modify or delete - but this might be over-ridden by DS policies (e.g. to maintain a journal for regulated supply chains)
  • Delete => mark as void
  • Modify/update => mark as void and re-assert
• May need to consider digitally signing:
  • Client queries (signed by Client)
  • DS records (signed by the resource owner / publisher)
  • Responses (signed by DS or by resource if responding directly)
• Potential problem of embedding URLs within DS records since any modification to the URL may break the original digital signature for each record.
  • Consider decoupling URL from each DS record - store in 'Resource Profile' instead
• May need DS to indicate to the client whether it was able to validate signatures
  • For signed DS records it received, where the record is not returned in full to a client
  • Whether the underlying DS record was signed or not, and whether its signature validated or not
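The "mark as void" and "void and re-assert" rules amount to an append-only journal. A minimal sketch, with hypothetical names and an in-memory list standing in for DS storage:

```python
from dataclasses import dataclass


@dataclass
class JournalEntry:
    record_id: int
    epc: str
    resource_id: str
    void: bool = False


class RecordJournal:
    """Append-only store: nothing is physically deleted or overwritten."""

    def __init__(self):
        self._entries = []

    def assert_record(self, epc, resource_id):
        entry = JournalEntry(len(self._entries), epc, resource_id)
        self._entries.append(entry)
        return entry.record_id

    def delete(self, record_id):
        self._entries[record_id].void = True        # delete => mark as void

    def modify(self, record_id, new_resource_id):
        old = self._entries[record_id]
        old.void = True                             # modify => void the old entry...
        return self.assert_record(old.epc, new_resource_id)  # ...and re-assert

    def current(self, epc):
        return [e.resource_id for e in self._entries if e.epc == epc and not e.void]

    def history(self):
        return list(self._entries)


journal = RecordJournal()
first = journal.assert_record("123", "epcis-1")
second = journal.modify(first, "epcis-9")
```

Queries see only the non-void entries, while the full history remains available for audit, e.g. in a regulated supply chain.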
Analysis of design options - Security & Trust
Service Availability
• DS should be designed to be resilient against Denial-of-Service attacks.
• The DS design should not compromise the clients or resources or make them more vulnerable to Denial-of-Service attacks.
• In Directory Service model, URL of resource is only released to clients fulfilling access policy restrictions - helps prevent attacks on resources
  • However, resources under attack may need to change their address (URL)
  • Need ability to decouple current (possibly mutable) URL of a resource from its immutable resource ID - rather than embedding URL within each DS record. (See also previous slide re digitally signed records)
• In Query Relay design, if clients rely on propagation of full (EPCIS) query via a query relay DS, they would be particularly dependent on DS availability (unless they have previously cached URLs of relevant resources)
Decoupling of URLs from Discovery Service records
[Diagram: within the Discovery Service, a DS Record (EPC or ID, Timestamp, ResourceID, [other metadata]) is linked by its ResourceID to a Resource Profile (URL, serviceType, ResourceID), which points to the Resource (e.g. EPCIS)]
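A small sketch of why the decoupling helps with signed records. The field names follow the diagram; everything else is an assumption. In particular, an HMAC with a shared demo key is used here purely as a stand-in for a real digital signature, which would normally use the publisher's private key.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class DSRecord:
    epc: str
    timestamp: str
    resource_id: str           # immutable ID, no URL inside the record

    def signed_bytes(self):
        return "|".join((self.epc, self.timestamp, self.resource_id)).encode("utf-8")


@dataclass
class ResourceProfile:
    resource_id: str
    url: str                   # mutable: may change, e.g. when under attack
    service_type: str


def sign(record, key):
    # Stand-in for a digital signature by the resource owner / publisher.
    return hmac.new(key, record.signed_bytes(), hashlib.sha256).hexdigest()


def verify(record, key, signature):
    return hmac.compare_digest(sign(record, key), signature)


def resolve(record, profiles):
    return profiles[record.resource_id].url


key = b"demo-key-not-for-real-use"
record = DSRecord("123", "2008-06-01T12:00:00Z", "res-42")
signature = sign(record, key)
profiles = {"res-42": ResourceProfile("res-42", "https://old.example/epcis", "EPCIS")}

profiles["res-42"].url = "https://new.example/epcis"   # resource moves
still_valid = verify(record, key, signature)
current_url = resolve(record, profiles)
```

Because the URL lives only in the profile, a change of address touches one profile entry and leaves every signed record intact.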
Analysis of design options - Security & Trust
Attack Scenarios
• Possible misuse of Query Relay design to launch DoS attacks on resources.
  • Possible countermeasures:
    • client authentication with DS,
    • limit how frequently client may make queries
• Registration of non-existent resource addresses for already assigned EPCs
  • Increases network load, slower responses (timeouts, retries)
  • Query Relay and each client of a Directory Service could identify resources that persistently fail - and remove them from resource cache or add them to blacklists
• Registration of existent resource addresses - but of incorrect service type
  • Countermeasure: authenticate resources before allowing them to publish
• Impersonation of valid clients by malicious clients to mislead DS or resources
  • Countermeasure: authentication of DS clients
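The countermeasure "limit how frequently client may make queries" could be realised in many ways; the sketch below assumes a simple sliding-window limit per authenticated client ID. The class name and parameters are hypothetical, and the current time is passed in explicitly to keep the example deterministic.

```python
from collections import defaultdict, deque


class QueryRateLimiter:
    """Allow at most max_queries per client within any window_s-second window."""

    def __init__(self, max_queries, window_s):
        self.max_queries = max_queries
        self.window_s = window_s
        self._history = defaultdict(deque)   # client ID -> recent query times

    def allow(self, client_id, now):
        recent = self._history[client_id]
        # Forget queries that have fallen out of the window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        if len(recent) >= self.max_queries:
            return False                      # do not relay this query
        recent.append(now)
        return True


limiter = QueryRateLimiter(max_queries=2, window_s=10)
decisions = [limiter.allow("client-a", t) for t in (0, 1, 2, 10)]
```

The third query at t=2 is refused, and the query at t=10 is accepted again because the one made at t=0 has left the window. Limits are tracked per client, so the scheme relies on client authentication to be meaningful.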
Analysis of design options - Security & Trust
Inter-working with NATs and Firewalls
• Clients must be able to interact with a DS from behind a firewall or Network Address Translation (NAT) box.
• Stateful firewalls match returning traffic with outbound addresses
  • Problem of responses from unexpected network addresses (especially in Query Relay model variant when responses are not returned via the DS but directly from resources)
  • Can also be a problem when sending responses via a message transport network (address of message router might not be expected/recognized)
• Client may need to provide client proxy (listener address) in DMZ for receiving inbound responses (allow for inspection while quarantined)
Analysis of design options - Security & Trust
Management of Access Control Policies
• Need high-level policies about which clients / resources can interact with DS
• Resources need to be able to restrict which clients can access their information (including the links to their information)
• For all models, the underlying resources need an access control mechanism
• For the Directory Service model, resources may need to be able to specify fine-grained access control policies to be enforced by the DS without the DS needing to contact the resource to check authorization
  • May be considered as a subset of access control policy for underlying resource
• In Directory Service model, DS holds significant amount of policy state information - but management of DS policy may only be marginally more than management of underlying resource policy
  • Maybe even possibility of using a common policy language / framework for both
• For Query Relay, policies are stored and enforced primarily at each underlying resource, although less granular policies may be pushed to DS / network to reduce load on resources
• Directory Service model provides clients some opportunity to avoid 'honeypot' harvesting attack (by allowing inspection of link before contact)
Analysis of design options - Network Performance / Resilience
• Persistent state information on a Discovery Service
  • EPC - resource links (both models)
  • Security Policies (more detailed for Directory Service model)
  • Client subscriptions to new DS records (so new resources can be found)
• Management
  • Client subscriptions should be self-managing with automated removal
  • DS may also provide automated retention management of records (Time-to-live / renewal of lease)
• Transient state information on a Discovery Service
  • Client session information (especially for Query Relay model) (manages correlation of responses from resources with client's query)
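Time-to-live with lease renewal can be sketched as follows. The class and method names are assumptions, and time is passed in as a plain number so that the behaviour is easy to follow.

```python
class LeasedRecords:
    """DS records that lapse unless the publishing resource renews them."""

    def __init__(self, ttl_s):
        self.ttl_s = ttl_s
        self._expiry = {}                     # (EPC, resource ID) -> expiry time

    def publish(self, epc, resource_id, now):
        self._expiry[(epc, resource_id)] = now + self.ttl_s

    def renew(self, epc, resource_id, now):
        # Renewing a lease simply restarts the time-to-live.
        self.publish(epc, resource_id, now)

    def lookup(self, epc, now):
        return sorted(
            rid for (e, rid), expires in self._expiry.items()
            if e == epc and expires > now
        )

    def purge(self, now):
        expired = [k for k, expires in self._expiry.items() if expires <= now]
        for k in expired:
            del self._expiry[k]
        return len(expired)


records = LeasedRecords(ttl_s=60)
records.publish("123", "epcis-1", now=0)
records.publish("123", "epcis-2", now=0)
records.renew("123", "epcis-1", now=50)       # lease now runs until t=110
live_at_70 = records.lookup("123", now=70)
removed = records.purge(now=70)
```

The same pattern could be applied to client subscriptions so that stale ones remove themselves without manual clean-up.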
Analysis of design options - Network Performance / Resilience
Transaction Duration, Transparency & Predictability
• Client should receive response with minimum delay / response time
• Client should be able to manage communication with DS and resources
  • Short, predictable response times preferred
  • Client should be able to detect failed communications
  • Client should be able to selectively retry only the communications that failed
• In Directory Service model, client has maximum control of communications with DS and underlying resources
• In Query Relay model, client must wait to ensure that all resources have had sufficient time to respond.
  • Difficulty in knowing which resources have failed to respond (especially if time interval is too short).
  • Possibly difficult to selectively retry the communication
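In the Directory Service model the client contacts each resource itself, so selective retry is straightforward. A minimal sketch, in which `fetch` is a hypothetical stand-in for whatever call the client uses to query one resource URL:

```python
def query_with_selective_retry(urls, fetch, max_attempts=2):
    """Query each resource; retry only those whose communication failed.

    fetch(url) is assumed to return a result or raise ConnectionError.
    Returns (results by URL, URLs that still failed after all attempts).
    """
    results = {}
    pending = list(urls)
    for _ in range(max_attempts):
        failed = []
        for url in pending:
            try:
                results[url] = fetch(url)
            except ConnectionError:
                failed.append(url)
        pending = failed                # next round touches only the failures
        if not pending:
            break
    return results, pending


calls = []


def flaky_fetch(url):
    # Simulated resource access: "epcis-b" fails on its first attempt only.
    calls.append(url)
    if url == "epcis-b" and calls.count("epcis-b") == 1:
        raise ConnectionError("temporarily unavailable")
    return "result from " + url


results, still_failed = query_with_selective_retry(["epcis-a", "epcis-b"], flaky_fetch)
```

Only `epcis-b` is contacted a second time. In the Query Relay model the client typically lacks the per-resource view needed to do this.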
Analysis of design options - Network Performance / Resilience
Caching to improve performance (within a Discovery Service or by clients)
• DS maintains an internal cache of resource availability
  • May be insufficient to answer an EPCIS query directed at a DS
• Client can cache responses from a DS for future use with the same EPCs
• Potential problem with Query Propagation model:
  • Client's cache may be missing potentially relevant resources because a previous client query to Query Relay network was too specific (so only some resources responded to the query)
  • Possible solution is for a resource to identify itself as a potential resource for a specific EPC, even if it had no results for the more specific query from the client
Analysis of design options - Network Performance / Resilience
Processing Load on DS
• DS should be able to handle multiple simultaneous client communications
• Requests should be handled quickly, with minimal computational effort
• Processing load depends on:
  • matching of client's request to DS records or routing tables
  • retrieval and enforcement of applicable security policies
• Note about supporting additional metadata fields (e.g. bizStep)
  • May add to complexity of DS search
  • May result in finer-grained security policies
  • May require post-processing of results to limit visibility of additional metadata
Conclusions
• Considered different models for interactions between clients, resources and intermediaries such as Discovery Services
• Choice depends on impact on security, performance & scalability
• Not necessarily a single solution for all kinds of supply chains
  • Friendly community supply chains vs strongly competitive vs highly regulated
• Directory Service is a traditional, well-proven approach but has unique challenges as a Discovery Service:
  • Delegated control and scalable expression, evaluation and enforcement of security policies
• Query Relay model is perhaps less obvious - but routing networks are established, e.g. in peer-to-peer content retrieval networks.
• Major challenges are detection and prevention of:
  • Honeypots for harvesting information from client queries
  • Injection of false information to mislead or cause disruption
• Need secure resource registration and policing of resource behaviour
• Need secure client registration and policing to prevent DoS attacks on resources
Further reading and Acknowledgements
These slides were prepared for the EPCglobal Data Discovery JRG face-to-face meeting (Alpharetta, June 2008) and are based on section B of deliverable D2.4 from the BRIDGE project:
BRIDGE WP02 High Level Design Discovery Services
http://www.bridge-project.eu/index.php/public-deliverables/en/
I would like to acknowledge that D2.4 B was jointly authored by:
• Trevor Burbridge (BT)
• Oliver Kasten (SAP)
• Cosmin Condea (SAP)
• Mark Harrison (University of Cambridge, Auto-ID Lab)
with additional inputs from:
• Nicholas Pauvre (GS1 France)
• members of AT4 wireless (leader of BRIDGE WP2 [DS])
A paper from the SAP team within BRIDGE on the Query Relay model appeared in the proceedings of the Internet of Things 2008 conference:
http://www.springerlink.com/content/v568wv5751r1187q/
http://dblp.uni-trier.de/rec/bibtex/conf/iot/KurschnerCKT08