
Dr. Robert E. Kahn Corporation for National Research Initiatives Reston, Virginia 20191

Digital Object Architecture: An open approach to Information Management on the Net. Bibliotheca Alexandrina. Dr. Robert E. Kahn, Corporation for National Research Initiatives, Reston, Virginia 20191. November 19, 2009.


Presentation Transcript


  1. Digital Object Architecture: An open approach to Information Management on the Net • Bibliotheca Alexandrina • Dr. Robert E. Kahn, Corporation for National Research Initiatives, Reston, Virginia 20191 • November 19, 2009

  2. Historically • The initial challenge was to get different computers to interoperate when they were all on a single network. • Subsequently, the Internet challenge was to get different packet networks to interoperate • And to enable computers on those diverse networks to talk to each other reliably • One initial objective was communicating bits without regard for what the receiver would later do with them • Or for hostile intermediate actors • Mission accomplished in the mid-1970s • Things have gotten much more sophisticated since then

  3. What is the Internet • It’s a set of protocols and procedures that allow different computers and networks to interoperate. • It links together virtually any packet network, independent of its internal characteristics • The Internet is not itself a network. • Rather it’s a global information system where the information flows allow the different constituent networks to work together.

  4. Trust and Authentication • These are both critical aspects • Bits received can always be checked for correctness using agreed encryption techniques • Certain techniques may be easier to employ • Others may be more efficient • But applications can be corrupted, and systems can be compromised • And even if the underlying application (apparently) runs properly, one needs to be sure that nothing nefarious is going on.

  5. Focusing • I purposely focus in the remainder of these remarks on what one can do to manage information in the Internet environment • I purposely do not address physical threats of the form that destroy capabilities. Surreptitious physical threats that modify capabilities lie in-between • I assume that all system components as well as information in digital form may be viewed as logical entities, of the same genre – known as “digital objects” • And communication between components (including users) is authenticable in a single logical fashion. • Given this, the problem is transformed into two alternate ones • how can the authentication be managed systemically • how can the components protect themselves from information attacks.

  6. Properties of DOs • They are machine independent and portable from platform to platform • Parts of a digital object may be accessed and protected separately from the object as a whole • Authentication of a DO may be enabled by using fingerprints of one or more parts of a DO • Which enables portability of such objects in many situations.
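
The idea of authenticating a DO through fingerprints of its parts can be illustrated with a minimal sketch. It assumes SHA-256 as the fingerprint function and a dictionary of named parts; neither choice comes from the slides, and the part names and contents are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest of one part of a digital object."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical digital object split into separately addressable parts.
parts = {
    "abstract": b"An open approach to information management.",
    "body": b"Full text of the document goes here.",
}

# One fingerprint per part, so each part can be verified on its own.
fingerprints = {name: fingerprint(data) for name, data in parts.items()}


def verify_part(name: str, data: bytes) -> bool:
    """Check a received part against the fingerprint recorded for it."""
    return fingerprints.get(name) == fingerprint(data)


print(verify_part("abstract", parts["abstract"]))  # unchanged part
print(verify_part("body", b"tampered text"))       # modified part
```

Because the fingerprints depend only on the bytes of each part, the same check works on any platform that holds a copy, which is one way to read the portability point on this slide.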

  7. What are Digital Objects • If you can’t uniquely identify a digital object, it doesn’t qualify as a digital object • It’s not the same as a name; it’s more like the object’s DNA. You can exist without a name, but not without your DNA • Like DNA, the identifier must be a part of the digital object • A digital object (DO) is defined as “structured data” that is machine parsable and that contains a unique persistent identifier.

  8. Is that all there is to a DO? • In one sense, yes. In another sense, no. • An important part of a DO is what I call the payload. When one accesses a DO, the payload is normally what is wanted • But a DO will generally have associated with it additional information, known as metadata, that provides state information about the DO. • Some of the metadata is always part of the payload • Some (or even all) of the metadata may be stored apart from the payload or even duplicated there • And a part of the metadata may be transaction information referencing the use of that digital object.
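
A digital object with an identifier, a payload and metadata might be modeled as follows. This is only a sketch: JSON as the machine-parsable encoding, the field names, and the identifier `12345/example-1` are all assumptions made for illustration, not a format defined in the talk.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class DigitalObject:
    identifier: str                     # unique persistent identifier
    payload: str                        # what a user normally wants
    metadata: Dict[str, Any] = field(default_factory=dict)  # state information

    def to_json(self) -> str:
        """Serialize to a machine-parsable form that carries the identifier."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "DigitalObject":
        """Parse structured data; reject it if it has no identifier."""
        record = json.loads(text)
        if not isinstance(record, dict) or not record.get("identifier"):
            raise ValueError("not a digital object: no identifier")
        return cls(
            record["identifier"],
            record.get("payload", ""),
            record.get("metadata", {}),
        )


do = DigitalObject(
    "12345/example-1",                  # hypothetical identifier
    "slide text",
    {"format": "text/plain", "language": "en"},
)
restored = DigitalObject.from_json(do.to_json())
print(restored == do)
```

The parser refuses structured data that lacks an identifier, mirroring the rule that something which cannot be uniquely identified does not qualify as a digital object.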

  9. Finding DOs • In many cases, one may know the identity of a DO a priori or even its location; in other cases, one may only know properties or characteristics of a DO and must rely on that knowledge to find it. • Search engines find web pages on the Internet by crawling the Web; but many computers, applications and systems are not available for a “public crawl” • But they can be characterized explicitly by their owners, managers or creators or with their permission • Systems that provide this information are called Metadata Registries. At a minimum, such registries respond to queries by returning the digital object identifiers, usually in a presentation format that can be visualized by a user. • Within the government, a good example of a metadata registry is ADL-R, created for the Advanced Distributed Learning Initiative in the Pentagon.

  10. Metadata Registries • Are generally used for searching, browsing or creating collections of information • They do not track operational details • The identifiers they return may be “resolved” to determine the relevant state information via a “resolution system” • We call these identifiers “handles” • The Handle System is the pre-eminent system for resolving digital object identifiers
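
A registry that answers queries by returning identifiers can be sketched in a few lines. The class below is an in-memory toy under stated assumptions: the class name, the exact-match query semantics, and the example handles and metadata fields are hypothetical and are not taken from ADL-R or any real registry.

```python
from typing import Dict, List


class MetadataRegistry:
    """Toy registry: owners describe their objects, queries return handles."""

    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, str]] = {}

    def register(self, handle: str, metadata: Dict[str, str]) -> None:
        """Record metadata supplied explicitly by the object's owner."""
        self._entries[handle] = dict(metadata)

    def query(self, **criteria: str) -> List[str]:
        """Return the handles whose metadata matches every criterion."""
        return sorted(
            handle
            for handle, meta in self._entries.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        )


registry = MetadataRegistry()
registry.register("12345/course-a", {"subject": "networking", "format": "video"})
registry.register("12345/course-b", {"subject": "networking", "format": "text"})
registry.register("12345/course-c", {"subject": "security", "format": "text"})

print(registry.query(subject="networking"))
print(registry.query(subject="networking", format="text"))
```

The registry returns only identifiers; obtaining current state for any of them is left to a separate resolution step.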

  11. Handle System: A general purpose resolution system • A detailed description is at www.handle.net • It has been operational on the net since 1994 and is in widespread use in many applications • Software may be downloaded from the net and users can run their own local handle services • Resolution of a handle produces a “handle record” which contains state information needed for immediate decision making or action • For example, the state information may contain • One or more IP Addresses • Terms and Conditions for access • Public Keys • Authentication information to validate the object itself
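
Resolution of a handle into a handle record can be pictured with a small in-memory stand-in for a local handle service. This sketch does not use the actual Handle System software or protocol: the table, the value type names (`IP_ADDRESS`, `TERMS`, `PUBLIC_KEY`) and the handle `12345/report-1` are hypothetical. The one detail taken from the real system is that a handle is written as a prefix and a suffix separated by the first `/`.

```python
from typing import Dict, List, Tuple

HandleRecord = List[Dict[str, str]]


def split_handle(handle: str) -> Tuple[str, str]:
    """Split a handle into its prefix and suffix at the first slash."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"malformed handle: {handle!r}")
    return prefix, suffix


# Hypothetical local service: each handle maps to typed state information.
LOCAL_SERVICE: Dict[str, HandleRecord] = {
    "12345/report-1": [
        {"type": "IP_ADDRESS", "value": "192.0.2.10"},
        {"type": "TERMS", "value": "read-only access"},
        {"type": "PUBLIC_KEY", "value": "<public key would go here>"},
    ],
}


def resolve(handle: str) -> HandleRecord:
    """Return the handle record for a handle, or raise LookupError."""
    split_handle(handle)  # validate the syntax before looking it up
    try:
        return LOCAL_SERVICE[handle]
    except KeyError:
        raise LookupError(f"handle not found: {handle}") from None


def values_of(record: HandleRecord, value_type: str) -> List[str]:
    """Pick out the values of one type from a handle record."""
    return [entry["value"] for entry in record if entry["type"] == value_type]


record = resolve("12345/report-1")
print(values_of(record, "IP_ADDRESS"))
```

A client would typically resolve first and then act on the returned state, for example by connecting to one of the listed addresses under the stated terms.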

  12. Repositories • Repositories provide access to digital objects • A repository may be housed in a physical location, or it may be a mobile program. • Communication with a Repository is via the digital object protocol which supports • Access to DOs based on handles • Authentication in both directions.

  13. Repository Notion • [Diagram; labels: REPOSITORY, Digital Object Manager, Storage System, Digital Object Protocol]

  14. DO Repository Server Software • Takes inputs based on identifiers and returns digital objects • Connects to existing and older legacy systems • Based on an open architecture • Achieves interoperability with other repository systems that support the protocol • Can provide additional application dependent functionality, if desired, by depositing executable digital objects

  15. Specific Interface Capabilities • Standard Interface is at a “meta level” • Allows new functionality to be added by defining new digital objects • Supports Authentication of Users and Services • Provides object level protection
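
A repository that takes identifiers, returns digital objects, and applies object-level protection could look roughly like this. The sketch is an in-memory illustration only: the class and method names, the per-object reader list, and the requester names are assumptions, and requesters are taken to be already authenticated.

```python
from typing import Dict, Iterable, Optional, Set


class Repository:
    """Toy repository: handles in, digital object payloads out."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        # None means the object is readable by anyone.
        self._readers: Dict[str, Optional[Set[str]]] = {}

    def deposit(
        self, handle: str, payload: bytes, readers: Optional[Iterable[str]] = None
    ) -> None:
        """Store an object and, optionally, restrict who may read it."""
        self._objects[handle] = payload
        self._readers[handle] = set(readers) if readers is not None else None

    def access(self, handle: str, requester: str) -> bytes:
        """Return the object for a handle if the requester is permitted."""
        if handle not in self._objects:
            raise KeyError(handle)
        allowed = self._readers[handle]
        if allowed is not None and requester not in allowed:
            raise PermissionError(f"{requester} may not read {handle}")
        return self._objects[handle]


repo = Repository()
repo.deposit("12345/public-1", b"open report")
repo.deposit("12345/private-1", b"restricted report", readers=["alice"])

print(repo.access("12345/public-1", "bob"))
print(repo.access("12345/private-1", "alice"))
```

Protection is attached to each object instead of to the repository as a whole, so two objects in the same store can carry different access rules.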

  16. Extensible Interface • <input sequence> <H1> <H2> <Parameters> <output sequence> • Where H1 is a handle for the operation to be applied to the Target DO H2. • Similarly, both parties A and B are known by their Handles HA and HB. • The steps of the protocol are: • Establish a connection from A to B • {Optionally} A asks B to authenticate himself • If successful, A provides an input string to B • {Optionally} B asks A to authenticate herself • B provides the results of the operation • Either party may choose to continue or close
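
The "operation handle applied to a target handle" pattern can be sketched as a dispatch table. This is not the digital object protocol itself: the whitespace-separated input format, the operation handles, the error strings, and the stored object are hypothetical, and the optional authentication steps are left out.

```python
from typing import Callable, Dict, List

# Hypothetical store of target digital objects, keyed by handle (H2).
objects: Dict[str, str] = {"12345/doc-1": "hello digital objects"}


def op_read(target: str, params: List[str]) -> str:
    """Return the payload of the target object."""
    return objects[target]


def op_word_count(target: str, params: List[str]) -> str:
    """Return the number of words in the target object's payload."""
    return str(len(objects[target].split()))


# Operations are themselves named by handles (H1); adding an entry here
# adds functionality without changing the request format.
OPERATIONS: Dict[str, Callable[[str, List[str]], str]] = {
    "12345/op-read": op_read,
    "12345/op-word-count": op_word_count,
}


def handle_request(input_sequence: str) -> str:
    """Party B's side: parse '<H1> <H2> <Parameters>' and run the operation."""
    tokens = input_sequence.split()
    if len(tokens) < 2:
        return "ERROR malformed request"
    h1, h2, params = tokens[0], tokens[1], tokens[2:]
    if h1 not in OPERATIONS:
        return "ERROR unknown operation"
    if h2 not in objects:
        return "ERROR unknown target"
    return OPERATIONS[h1](h2, params)


print(handle_request("12345/op-read 12345/doc-1"))
print(handle_request("12345/op-word-count 12345/doc-1"))
```

Because the request format never names a fixed set of verbs, a new operation only needs a new handle and an implementation behind it.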

  17. Displaced Vulnerabilities? • The Handle System can be attacked • But it’s fully distributed and can be replicated • And it can be locally protected from external unauthorized intrusions so external actions won’t affect local usage • Private Keys can be lost • But revocation will prevent continued damage • And replication of digital objects can mitigate corruption of information

  18. Vulnerabilities (cont’d) • Registries can be corrupted, or access denied to authorized users due to hostile action. Replication of registries is one solution to this problem. • Repositories may be corrupted and produce the wrong information. One must take care where one trusts the deposit of information, just as one must take care in depositing other assets in, say, banks
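
One way replication can mitigate corruption is to compare the copies held by several replicas. The sketch below assumes SHA-256 fingerprints and a simple majority rule when no trusted fingerprint is available; both are illustrative choices, not mechanisms described in the talk.

```python
import hashlib
from collections import Counter
from typing import List, Optional


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest of a copy of a digital object."""
    return hashlib.sha256(data).hexdigest()


def read_from_replicas(copies: List[bytes], expected: Optional[str] = None) -> bytes:
    """Pick a trustworthy copy from several replicas.

    If a trusted fingerprint is known (for example, from a resolved
    record), return the first copy that matches it. Otherwise fall back
    to the copy held by a strict majority of replicas.
    """
    if not copies:
        raise ValueError("no replicas available")
    if expected is not None:
        for copy in copies:
            if fingerprint(copy) == expected:
                return copy
        raise ValueError("no replica matches the expected fingerprint")
    digest, votes = Counter(fingerprint(c) for c in copies).most_common(1)[0]
    if votes <= len(copies) // 2:
        raise ValueError("no majority among replicas")
    return next(c for c in copies if fingerprint(c) == digest)


good, bad = b"original record", b"corrupted record"
print(read_from_replicas([good, bad, good]))
print(read_from_replicas([bad, good], expected=fingerprint(good)))
```

A majority vote only helps while most replicas are intact, which is why the choice of where to deposit information still matters.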

  19. Bottom Line • This approach allows digital information to be managed effectively over very long as well as very short time frames • All the architectural components have well-defined open interfaces, protocols and returned objects which will stand the test of time. • The architecture allows the investment in creating digital information to be made once and easily ported from technology base to technology base. • The modular nature of the architecture allows the system to be managed component by component
