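Chapter 5: Data Management. Gong Bin, School of Computer Science and Technology, Shandong University; Shandong Provincial High Performance Computing Center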
Globus Data Management Services http://www.globus.org/
Data Management Services • Data transfer and access • GASS: Provides services mainly intended for use with GRAM (file staging, I/O redirection) • GridFTP: Provides high-performance, reliable data transfer for modern WANs • Data replication and management • Replica catalog: Provides a catalog service for keeping track of replicated datasets • Replica management: Provides services for creating and managing replicated datasets
Purpose of a Common Data Access Protocol • Existing distributed data storage systems • DPSS, HPSS: focus on high-performance access, utilize parallel data transfer, striping • DFS: focus on high-volume usage, dataset replication, local caching • SRB: connects heterogeneous data collections, uniform client interface, metadata queries • Problems (the protocols are incompatible and not open, and each has its own client) • Incompatible protocols and features • Each requires a custom client • Partitions available data sets and storage devices • Each protocol offers only part of the desired functionality
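A Common, Secure, Efficient Data Access Protocol Is Needed • A common, extensible transfer protocol • Implies interoperability • Separates the low-level transfer mechanism from the storage service • Advantages: • New, specialized storage systems are automatically compatible with existing systems • Existing systems gain richer data transfer functionality • Interfaces to multiple storage systems • HPSS, DPSS, file systems • Plan for SRB integration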
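Globus Proposes the Unified GridFTP Protocol • Based on FTP • Compatible with a large number of existing tools • Already supports many of the features data grids need, and is easy to extend • Widely accepted, understood, and supported • Existing specifications • RFC 959: File Transfer Protocol • RFC 2228: FTP Security Extensions • RFC 2389: Feature Negotiation for the File Transfer Protocol • What does GridFTP include? • A protocol • A set of tools implementing the protocol • GridFTP is a superset of FTP and is not limited to file transfer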
GridFTP: Basic Approach • Start from the most commonly used subset • Standard FTP: get/put etc., 3rd-party transfer • Implement features that are standardized but not often used • GSS binding, extended directory listing, simple restart • Extend in several areas while preserving interoperability with existing servers • Striped/parallel data channels, partial file, automatic & manual TCP buffer setting, progress monitoring, extended restart
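Features of GridFTP • Interoperable and extensible • Two separations • The underlying data transfer mechanism is separated from the data storage service • The control channel is separated from the data channel • Inherits the generality and ubiquity of FTP • FTP is defined by several IETF (Internet Engineering Task Force) RFCs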
GridFTP Extensions to FTP • Security: GSI (public-key-based Grid Security Infrastructure) or Kerberos support (via GSS-API) • GridFTP provides this capability by implementing the GSS-API authentication mechanisms defined by RFC 2228, "FTP Security Extensions". • Third-party control of data transfer • A "third-party" user or application at one site can initiate, monitor and control a data transfer operation between two other "parties". • Parallel data transfer • Using multiple TCP streams in parallel (even between the same source and destination) can improve aggregate bandwidth over using a single TCP stream.
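To make the parallel-transfer idea concrete, here is a minimal Python sketch of how a file might be divided into contiguous byte ranges, one per TCP stream. The function name `split_ranges` and the even-split policy are assumptions made for illustration; they are not taken from the GridFTP protocol or from the Globus libraries.

```python
def split_ranges(file_size, streams):
    """Divide [0, file_size) into `streams` contiguous (offset, length) ranges.

    Hypothetical helper: GridFTP itself does not prescribe this policy.
    """
    if streams < 1:
        raise ValueError("need at least one stream")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


even = split_ranges(1_000_000, 4)
uneven = split_ranges(10, 3)
print(even)
print(uneven)
```

Each (offset, length) pair could then be handed to its own stream; the same pair is also what a partial file transfer needs to describe the portion it wants.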
GridFTP Extensions to FTP (continued) • Striped data transfer • Uses multiple TCP streams to transfer data that is partitioned among multiple servers. Striped transfers provide further bandwidth improvements over those achieved with parallel transfers. • Partial file transfer • Transfers portions of files rather than complete files. • Automatic negotiation of TCP buffer/window sizes • Using optimal settings for TCP buffer/window sizes can have a dramatic impact on data transfer performance. • Reliable data transfer • Fault recovery methods for handling transient network failures and server outages • Restarting failed transfers
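A common rule of thumb for choosing a TCP buffer size is the bandwidth-delay product: the path bandwidth multiplied by the round-trip time. The sketch below computes it. The function name `bdp_bytes` and the 1 Gbit/s, 50 ms figures are illustrative assumptions, not values from this chapter.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes, using integer arithmetic."""
    # bits/s * ms / 1000 gives bits in flight; divide by 8 for bytes.
    return bandwidth_bits_per_s * rtt_ms // (8 * 1000)


# Illustrative assumption: a 1 Gbit/s path with a 50 ms round-trip time.
buffer_size = bdp_bytes(1_000_000_000, 50)
print(buffer_size)
```

A buffer much smaller than this value keeps the sender waiting for acknowledgements, which is one reason buffer settings have such a large effect on wide-area transfers.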
GridFTP Protocol Implementation • globus_ftp_control library • Control channel API • Manages a GridFTP connection, including authentication, creation of control and data channels, and reading and writing data over data channels • Separate control and data channels • globus_ftp_client library • GridFTP client API (provides higher-level client features on top of the globus_ftp_control library) • Complete file get and put operations • Calls to set the level of parallelism for parallel data transfers • Partial file transfer operations • Third-party transfers • Calls to set TCP buffer sizes
GridFTP at SC'2000: Long-Running Dallas-Chicago Transfer • [Figure: plot of the transfer, annotated with the following events: SciNet power failure; other demos starting up (congestion); parallelism increases (demos); DNS problems; transition between files (not zero due to averaging); backbone problems on the SC floor]
Striped GridFTP Server • [Figure: architecture diagram. Labels: GridFTP client; GridFTP control channel; GridFTP data channels (to client or another striped GridFTP server); GridFTP server master; mpirun; GridFTP server parallel backend; control socket; MPI (Comm_World); MPI (Sub-Comm); MPI-IO; four backend nodes, each with Control and Plug-in components; parallel file system (e.g. PVFS, PFS, etc.)]
Test Results • Striped servers at source location: 8 • Striped servers at destination location: 8 • Maximum simultaneous TCP streams per server: 4 • Maximum simultaneous TCP streams overall: 32 • Peak transfer rate over 0.1 seconds: 1.55 Gbits/sec • Peak transfer rate over 5 seconds: 1.03 Gbits/sec • Sustained transfer rate over 1 hour: 512.9 Mbits/sec • Total data transferred in 1 hour: 230.8 Gbytes
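The figures in these test results are consistent with one another, as a quick check shows. The sketch assumes decimal units (1 Mbit = 10^6 bits, 1 Gbyte = 10^9 bytes), which the slide does not state explicitly.

```python
# Values taken from the test results.
servers = 8
streams_per_server = 4
sustained_mbits_per_s = 512.9
seconds = 3600  # one hour

# Assumption: decimal units (1 Mbit = 1e6 bits, 1 Gbyte = 1e9 bytes).
max_streams = servers * streams_per_server
total_gbytes = sustained_mbits_per_s * 1e6 * seconds / 8 / 1e9

print(max_streams)
print(round(total_gbytes, 1))
```

The stream count matches the stated overall maximum of 32, and the sustained rate over one hour reproduces the stated total of roughly 230.8 Gbytes.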
GASS (Global Access to Secondary Storage) • GASS: Global Access to Secondary Storage, part of the Globus Toolkit • Remote I/O and staging • GRAM can fetch remote executables through GASS • Access files remotely • Connect stdin/stdout/stderr to remote locations
Global Access to Secondary Storage • (a) GASS file access API • Replace open/close with globus_gass_open/close; read/write calls can then proceed directly • (b) RSL extensions • URLs used to name executables, stdout, stderr • (c) Remote cache management utility • (d) Low-level APIs for specialized behaviors
GASS Architecture • [Figure: architecture diagram with the four elements below, plus a cache, a GASS server, an HTTP server and an FTP server] • (a) GASS file access API, as used in a program: main( ) { fd = globus_gass_open(…) … read(fd,…) … globus_gass_close(fd) } • (b) RSL extensions, as passed to GRAM: &(executable=https://…) • (c) Remote cache management: % globus-gass-cache • (d) Low-level APIs for customizing cache & GASS server
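GASS File Naming • From the program's point of view, the GASS file open and close calls are almost identical to the corresponding standard Unix I/O calls, except that a URL replaces the file name. • URL encoding of resource names: https://quad.mcs.anl.gov:9991/~bester/myjob (protocol: https; server address: quad.mcs.anl.gov:9991; file name: ~bester/myjob) • Other examples: https://pitcairn.mcs.anl.gov/tmp/input_dataset.1 and https://pitcairn.mcs.anl.gov:2222/./output_data • Supports http & https, ftp & gridftp.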
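The decomposition of a GASS-style URL into protocol, server address and file name can be reproduced with Python's standard `urllib.parse` module. This is only an illustration of the naming scheme; the helper name `describe` is hypothetical and GASS itself does not use Python.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL into the parts named on the slide (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,   # None when the URL gives no explicit port
        "file": parts.path,
    }


job = describe("https://quad.mcs.anl.gov:9991/~bester/myjob")
dataset = describe("https://pitcairn.mcs.anl.gov/tmp/input_dataset.1")
print(job)
print(dataset)
```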
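Three Access Modes • Read: reads an entire file containing stable data and caches the whole file locally. • Several users may read concurrently • Write: writes a single file; operations go to the local cache, and the file is written back to the remote end only after it is closed. • Several users may write concurrently; the last writer wins • Append: appends to a file by operating directly on the remote file, which changes immediately. • Several users may append concurrently, but concurrent writes are interleaved line by line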
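File Cache • "File open" transfers the remote file to the local cache • Avoids repeated opens of the same remote file by multiple local processes • The cache is per user, and the user can manage it with local resource management tools • Programs access the file cache through the cache API • Users can manage the cache remotely through GRAM • A user may have multiple caches. Each cache has an entry that records the number of opens; each close decrements the count, and when the count reaches 0 the cached file is deleted.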
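Three Security Authentication Modes • Plain anonymous ftp or http, i.e. no authentication • GSI authentication between processes • In the future, ftp or http access with SSL-based authentication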
Workflow • [Figure: workflow diagram. Local machine: globus-job-run, multi-RSL request, parse, DUROC, two single-RSL requests, two GRAM clients with GSI, GASS server. Two remote machines, each with: GRAM gatekeeper with GSI, GRAM job manager, GASS client, App, Nexus]
GASS RSL Extensions • executable, stdin, stdout, stderr can be local files or URLs • executable and stdin loaded into local cache before job begins (on front-end node) • stdout, stderr handled via GASS append mode • Cache cleaned after job completes
GASS/RSL Example &(executable=https://quad:1234/~/myexe) (stdin=https://quad:1234/~/myin) (stdout=/home/bester/output) (stderr=https://quad:1234/dev/stdout)
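An RSL request like the one above is a conjunction (`&`) of parenthesized attribute=value relations, so it can be assembled mechanically. The sketch below uses a hypothetical helper, `build_rsl`, and performs no quoting or escaping of values, which a real RSL generator would need to handle.

```python
def build_rsl(attributes):
    """Join attribute/value pairs into an RSL conjunction (hypothetical helper).

    Assumption: values need no quoting; real RSL has quoting rules
    that this sketch ignores.
    """
    return "&" + "".join(f"({name}={value})" for name, value in attributes.items())


rsl = build_rsl({
    "executable": "https://quad:1234/~/myexe",
    "stdin": "https://quad:1234/~/myin",
    "stdout": "/home/bester/output",
    "stderr": "https://quad:1234/dev/stdout",
})
print(rsl)
```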
GASS API • GLOBUS GASS_CACHE API • GLOBUS GASS FILE ACCESS API • GLOBUS GASS SERVER API • GLOBUS GASS SERVER_EZ API • GLOBUS GASS CLIENT API
GASS File Access API • Minimum changes to application • globus_gass_open(), globus_gass_close() • Same as open(), close() but use URLs instead of filenames • Caches URL in case of multiple opens • Return descriptors to files in local cache or sockets to remote server • globus_gass_fopen(), globus_gass_fclose()
globus_gass_open()/close() • [Figure: flowchart] • globus_gass_open(): URL in cache? • no: download file into cache • yes: open cached file, add cache reference • globus_gass_close(): modified? • yes: upload changes • no: remove cache reference
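The open/close flow of the GASS file access API can be modelled with a small reference-counted cache. The Python sketch below is a toy model, not the Globus API: the class `GassCacheSketch`, the dictionary standing in for remote servers, and the content comparison used to detect modification are all assumptions. It also assumes that a freshly downloaded file is then opened and referenced, and that the reference is removed after any upload.

```python
class GassCacheSketch:
    """Toy model of the open/close flow (hypothetical, not the Globus API)."""

    def __init__(self, remote):
        self.remote = remote   # dict URL -> contents; stands in for remote servers
        self.files = {}        # URL -> locally cached contents
        self.refs = {}         # URL -> number of outstanding opens
        self.downloads = 0

    def open(self, url):
        if url not in self.files:                 # URL in cache? no
            self.files[url] = self.remote[url]    # download file into cache
            self.downloads += 1
        self.refs[url] = self.refs.get(url, 0) + 1  # add cache reference
        return url

    def write(self, url, data):
        self.files[url] = data                    # writes touch only the local copy

    def close(self, url):
        if self.files[url] != self.remote.get(url):  # modified? yes
            self.remote[url] = self.files[url]       # upload changes
        self.refs[url] -= 1                          # remove cache reference
        if self.refs[url] == 0:                      # last close: drop the cached file
            del self.files[url]
            del self.refs[url]


url = "https://pitcairn.mcs.anl.gov/tmp/input_dataset.1"
remote = {url: "v1"}
cache = GassCacheSketch(remote)
cache.open(url)
cache.open(url)          # second open reuses the cached copy
cache.write(url, "v2")
cache.close(url)         # uploads the change, one reference remains
cache.close(url)         # count reaches 0, cached file is removed
```

Opening the same URL twice triggers a single download, which is the point of caching in case of multiple opens.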
globus_gass_transfer • Common API for transferring remote files/data over various protocols • http and https currently supported • ftp will be supported in a future release • Supports put and get operations on a URL • Allows for efficient transfer to/from files or direct to/from memory • Allows any application to easily add customized file/data service capabilities
globus_gass_copy • Simple API for copying data from a source to a destination • URL used for source and destination • http(s), (gsi)ftp, file • When transferring from ftp to ftp, it uses 3rd-party transfer (i.e. client-mediated, direct server-to-server transfer) • The globus-url-copy program is a simple wrapper around the globus_gass_copy API
globus-gass-server • Simple file server • Run by user wherever necessary • Secure https protocol, using GSI • APIs for embedding server into other programs • Example: globus-gass-server -r -w -t • -r: Allow files to be read from this server • -w: Allow files to be written to this server • -t: Tilde expand (~/… -> $(HOME)/…) • -help: For list of all options
globus_gass_server_ez • Very simple API for adding file service to any application • Wrapper around globus_gass_transfer • globusrun uses this module to support executable staging, stdout/err redirection, and remote file access
GRAM & GASS: Putting It Together • 1. Derive contact string • 2. Build RSL string • 3. Start up GASS server • 4. Submit request • 5. Return output • [Figure: diagram with arrows numbered 1 to 5 connecting: globus-job-run, host name, contact string, command line args, RSL string, GASS server, gatekeeper, jobmanager, program, stdout]
Globus Components in Action • [Figure: component diagram. Local machine: X509 user cert, grid-proxy-init, user proxy cert, mpirun, machines, RSL string, globusrun, RSL multi-request, RSL parser, DUROC, RSL single requests, two GRAM clients with GSI, GASS server. Two remote machines, each with: GRAM gatekeeper with GSI, GRAM job manager, GASS client, App, Nexus; one uses PBS with AIX MPI and the other Unix fork with Solaris MPI]
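Replica Management Functions: Logical File Name -> Physical Location • Create/delete replicas (all or part of an existing data set) • Register (new data set -> Replica Catalog) • Query (a user or application asks for the replica information of a specific file or set of files) • Select (which replica is most suitable? The information service supplies storage and network information) • Data transfer is implemented using the replica catalog and GridFTP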
Replica Management Approach • Replica Catalog • Stores information about locations, files, and data collections • Translates logical file names to physical file locations • Records which logical files a file collection contains • Replica Management • A set of services for registering files in the replica catalog, publishing files to locations, and adding/removing replicas at other locations • Locate and select replicas of files • Uses Replica Catalog and GridFTP
Globus Replica Catalog • Features: separates data replica management from metadata management (which makes replica management less difficult) • Supports different metadata catalogs • Metadata management supplies detailed information in support of data replication • Specific functions • Registration • Registering a list of files as a logical collection • Registering the physical location of a complete or partial replica of a logical collection • Registering information about a particular logical file in a logical collection • Creation and modification • Modifying the contents of registered entries in the catalog • Creating new copies of a complete or partial collection of files • Query • Find all physical locations for a set of logical files in a logical collection • List all the descriptive attributes associated with a registered logical collection, location or logical file
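Role of the Replica Catalog • Tracks the multiple physical copies of a logical file, mapping each logical file to multiple physical files • Maintains groups (collections) of logical file names • Location: maps a unique logical file name to multiple physical locations • Logical file entry: stores information about a single logical file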
Replica Management • Maintain a mapping between logical names for files and collections and one or more physical locations • Important for many applications • Example: CERN HLT data • Multiple petabytes of data per year • Copy of everything at CERN (Tier 0) • Subsets at national centers (Tier 1) • Smaller regional centers (Tier 2) • Individual researchers will have copies
Globus Replica Management • Identify replica cataloging and reliable replication as two fundamental services • Layer on other Grid services: GSI, transport, information service • Use LDAP as catalog format and protocol, for consistency • Use as a building block for other tools • Advantage • These services can be used in a wide variety of situations
Replica Manager Components • Replica catalog definition • LDAP object classes for representing logical-to-physical mappings in an LDAP catalog • Low-level replica catalog API • globus_replica_catalog library • Manipulates replica catalog: add, delete, etc. • High-level reliable replication API • globus_replica_manager library • Combines calls to file transfer operations and calls to low-level API functions: create, destroy, etc.
Replica Catalog Structure: A Climate Modeling Example • [Figure: catalog tree] • Replica catalog • Logical collection: CO2 measurements 1998 • Logical collection: CO2 measurements 1999 • Logical collection contents: Filename: Jan 1998, Filename: Feb 1998, … • Location jupiter.isi.edu: Filename: Mar 1998, Filename: Jun 1998, Filename: Oct 1998; Protocol: gsiftp; UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate • Location sprite.llnl.gov: Filename: Jan 1998 … Filename: Dec 1998; Protocol: ftp; UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi • Logical file entries (under Logical File Parent): Logical File Jan 1998, Logical File Feb 1998; attribute shown: Size: 1468762
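The logical-to-physical mapping in this example can be sketched as a nested Python dictionary plus a lookup function. Everything structural here is an assumption made for illustration: the dictionary layout, the function name `physical_urls`, attaching both locations to the 1998 collection, and forming a physical URL by appending the percent-encoded file name to the UrlConstructor. The real catalog is an LDAP directory. Because the slide elides most of the file list at sprite.llnl.gov, only the two names it spells out are included.

```python
from urllib.parse import quote

# Hypothetical in-memory stand-in for the catalog shown in the figure.
catalog = {
    "CO2 measurements 1998": {
        "locations": [
            {
                "host": "jupiter.isi.edu",
                "url_constructor": "gsiftp://jupiter.isi.edu/nfs/v6/climate",
                "files": {"Mar 1998", "Jun 1998", "Oct 1998"},
            },
            {
                "host": "sprite.llnl.gov",
                "url_constructor": "ftp://sprite.llnl.gov/pub/pcmdi",
                # The slide elides the entries between these two names.
                "files": {"Jan 1998", "Dec 1998"},
            },
        ],
    },
}


def physical_urls(catalog, collection, logical_file):
    """Return one URL per location that holds the logical file."""
    urls = []
    for location in catalog[collection]["locations"]:
        if logical_file in location["files"]:
            # Assumption: URL = UrlConstructor + "/" + encoded file name.
            urls.append(location["url_constructor"] + "/" + quote(logical_file))
    return urls


march = physical_urls(catalog, "CO2 measurements 1998", "Mar 1998")
january = physical_urls(catalog, "CO2 measurements 1998", "Jan 1998")
missing = physical_urls(catalog, "CO2 measurements 1998", "Feb 1999")
print(march, january, missing)
```

A location may hold only part of a collection, so a lookup can return zero, one or several URLs.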
Replica Catalog Services as Building Blocks: Examples • Combine with information service to build replica selection services • E.g. "find best replica" using performance info from NWS and MDS • Use of LDAP as common protocol for info and replica services makes this easier • Combine with application managers to build data distribution services • E.g., build new replicas in response to frequent accesses
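A "find best replica" service could, in its simplest form, pick the replica location with the highest predicted bandwidth. The Python sketch below shows that selection step only. The function name `best_replica` and the bandwidth numbers are hypothetical placeholders for what an information service such as NWS or MDS would report; no real NWS or MDS call is made.

```python
def best_replica(replica_hosts, predicted_mbits):
    """Pick the host with the highest predicted bandwidth (hypothetical helper).

    Hosts without a prediction are skipped; returns None if none remain.
    """
    known = [host for host in replica_hosts if host in predicted_mbits]
    if not known:
        return None
    return max(known, key=lambda host: predicted_mbits[host])


# Made-up predictions standing in for information-service data.
predictions = {"jupiter.isi.edu": 40.0, "sprite.llnl.gov": 95.0}
choice = best_replica(["jupiter.isi.edu", "sprite.llnl.gov"], predictions)
unknown = best_replica(["example.org"], predictions)
print(choice, unknown)
```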
Relationship to Metadata Catalogs • Metadata services describe data contents • Have defined a simple set of object classes • Must support a variety of metadata catalogs • MCAT being one important example • Others include LDAP catalogs, HDF • Community metadata catalogs • Agree on set of attributes • Produce names needed by replica catalog: • Logical collection name • Logical file name
Replica Catalog Implementation • Currently implemented with a Lightweight Directory Access Protocol (LDAP) directory • May be implemented with a database in the future
Replica Catalog Directions • Many data grid applications do not require tight consistency semantics • At any given time, you may not be able to discover all copies • When a new copy is made, it may not be immediately recognized as available • Allows for much more scalable design • Distributed catalogs: local catalogs which maintain their own LFN -> PFN mapping • Soft-state updates as basis for building various configurations of global catalogs
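Soft-state updates mean that a global index keeps an entry only as long as the owning local catalog keeps refreshing it; entries that are not refreshed simply expire. The Python sketch below illustrates the idea under stated assumptions: the class `SoftStateIndex`, the fixed time-to-live, and the explicit `now` argument (used instead of a real clock to keep the example deterministic) are all hypothetical and not part of any Globus design.

```python
class SoftStateIndex:
    """Toy global index of LFN -> sites with expiring entries (hypothetical)."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.expiry = {}  # (lfn, site) -> time at which the entry lapses

    def refresh(self, lfn, site, now):
        # A local catalog announces (or re-announces) that it holds the file.
        self.expiry[(lfn, site)] = now + self.ttl

    def lookup(self, lfn, now):
        # Only entries that have not yet lapsed are reported.
        return sorted(
            site
            for (name, site), expires in self.expiry.items()
            if name == lfn and expires > now
        )


index = SoftStateIndex(ttl=60)
index.refresh("Jan 1998", "sprite.llnl.gov", now=0)
index.refresh("Jan 1998", "jupiter.isi.edu", now=30)

at_50 = index.lookup("Jan 1998", now=50)    # both entries still live
at_70 = index.lookup("Jan 1998", now=70)    # the first entry has lapsed
at_100 = index.lookup("Jan 1998", now=100)  # nothing has been refreshed
print(at_50, at_70, at_100)
```

This matches the relaxed consistency described above: at any given time a lookup may miss a copy that exists, or briefly list none, without any explicit delete operation being needed.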