
Creating a simplified global unique file catalogue



Presentation Transcript


  1. Creating a simplified global unique file catalogue
     Miguel Martinez Pedreira, Pablo Saiz

  2. Motivations
     • Reaching memory limits on the current catalogue database machine(s)
       • The catalogue is now ~700 GB
       • Some tables are too large (in G#L and L#L)
       • e.g. G132L is 55 GB; we tried to split it
     • Converting the old catalogue to the new catalogue takes too long
       • Between 15 and 20 hours
       • It gets stuck on some of these large tables
       • Tried several MySQL configurations (threads, memory, pools...)
       • Tried on alientest03 and alientest07
       • Helped to find things that shouldn’t be there
     • In search of better performance

  3. Motivations
     • The trend points to an even bigger increase

  4. Previous status
     • The catalogue implemented a multi-host database
       • Intended to spread the database over different machines
       • Never used in practice: users and data are on the same host
     • Hosts index
     • No referential integrity

  5. Current status (production)
     • The only references are between G#L_REF and G#L_PFN and G#L, and between those and SE
     • Found some entries that shouldn’t be there
       • e.g. PFNs with GUIDs that don’t exist
       • some unused or temporary tables (old functionality, L#L without _s)
     • ~250 L tables (x3)
     • ~100 G tables (x4)
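
The orphaned entries mentioned above (PFNs whose GUID does not exist) can be found with an anti-join. The sketch below is illustrative only: it uses SQLite in place of the production MySQL database, and the table and column names (`guid_table`, `pfn_table`, `guidId`, `pfnId`) and the sample rows are assumptions, not the real catalogue schema.

```python
import sqlite3

# In-memory stand-in for the catalogue; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE guid_table (guidId INTEGER PRIMARY KEY, guid TEXT);
    CREATE TABLE pfn_table  (pfnId INTEGER PRIMARY KEY, guidId INTEGER, pfn TEXT);
    INSERT INTO guid_table VALUES (1, 'guid-a');
    INSERT INTO pfn_table  VALUES (10, 1, 'root://example//a');
    INSERT INTO pfn_table  VALUES (11, 2, 'root://example//b');  -- guidId 2 is missing
""")

# Anti-join: keep PFN rows for which no matching GUID row exists.
orphans = conn.execute("""
    SELECT p.pfnId, p.pfn
    FROM pfn_table AS p
    LEFT JOIN guid_table AS g ON g.guidId = p.guidId
    WHERE g.guidId IS NULL
""").fetchall()

print(orphans)
```

Without foreign keys nothing prevents such rows from being inserted, so a query of this kind is the only way to detect them after the fact.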

  6. Improvements
     • Complete referential integrity: FKs all around
       • consistent database
     • InnoDB tables
       • row-level locking
     • Using ids (instead of text fields) for referencing
       • normalized data, surrogate keys
     • Cleaning tables, renaming fields
     • Fixing some keys
     • Led to version v2-20 (currently in PANDA)
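
A minimal sketch of what "FKs all around" with integer surrogate keys buys, assuming a two-table layout. SQLite is used here only so the example runs standalone (the slides refer to MySQL/InnoDB), and the table names `se` and `pfn`, their columns, and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE se (
        seId   INTEGER PRIMARY KEY,          -- surrogate key
        seName TEXT UNIQUE NOT NULL
    );
    CREATE TABLE pfn (
        pfnId INTEGER PRIMARY KEY,
        seId  INTEGER NOT NULL REFERENCES se(seId),  -- id reference, not a text field
        pfn   TEXT NOT NULL
    );
    INSERT INTO se (seId, seName) VALUES (1, 'EXAMPLE::SE');
""")

# A row pointing at an existing storage element is accepted.
conn.execute("INSERT INTO pfn (seId, pfn) VALUES (1, 'root://example//file1')")

# A row pointing at a storage element that does not exist is rejected.
try:
    conn.execute("INSERT INTO pfn (seId, pfn) VALUES (99, 'root://example//file2')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("dangling reference rejected:", rejected)
```

With the constraint in place the database itself refuses dangling references, so inconsistent entries cannot be created in the first place.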

  7. [Figure: database schema diagram; labels: x 100, x 250, "Link to documentation!"]

  8. How to improve
     • New (Pablo’s) idea: filesystem catalogue
       • No database; instead the catalogue is a UNIX-like filesystem
       • Metadata files keep the necessary information
       • Aggregating this information turned out to be quite similar to having the DB itself...
     • Tried and discussed different available filesystems
       • dealing with inodes, file limits, performance...
     • Performance was measured
       • unfortunately, it was too slow
     • What else? -> GUIDless catalogue

  9. GUIDs
     • GUID = Global Unique IDentifier
     • GUIDs were originally introduced because LFNs were expected to be renamed very often, while GUIDs would stay static
     • They give flexibility and functionality
       • Very nice for mirroring
       • Permissions per LFN and per GUID
         • e.g. for having reduced permissions on links
       • Used for the quotas too
     • BUT: not really used in practice, and they add quite a lot of complexity compared to the benefits
     [Figure: diagram relating LFN, GUID and PFN entries]

  10. Deleting GUIDs
      • We plan to use the LFN combined with a timestamp to keep entries unique
        • So entries can be told apart when an LFN is deleted and a new LFN with the same name is registered
      • File naming for storage in SEs was based on GUIDs
        • now it is LFN + timestamp: more human-readable
      • Simplicity
      • 240 GB directly gone (~35% of the DB)
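
The LFN + timestamp idea can be sketched as follows. The slides do not give the actual naming format, so the `storage_name` helper, the separator, the timestamp layout and the example path are all assumptions made for illustration.

```python
from datetime import datetime, timezone

def storage_name(lfn: str, registered: datetime) -> str:
    """Hypothetical scheme: the LFN followed by its UTC registration time."""
    stamp = registered.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{lfn}.{stamp}"

# The same LFN registered, deleted, and registered again a month later.
lfn = "/example/dir/file.root"
first = storage_name(lfn, datetime(2013, 5, 1, 12, 0, 0, tzinfo=timezone.utc))
second = storage_name(lfn, datetime(2013, 6, 1, 12, 0, 0, tzinfo=timezone.utc))

print(first)
print(second)
```

The two registrations get different storage names even though the LFN is identical. Note that this sketch uses one-second resolution, so two registrations of the same LFN within one second would collide; a real scheme would need a finer timestamp or an extra tie-breaker.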

  11. Links
      • At the moment we have the ‘special’ PFN ‘guid://’ for handling links
      • A link is an LFN pointing to another LFN
      • Using GUIDs is more flexible (but less consistent)
      • Our design implies pointing inside the same index, but that is the use case in production (job outputs and archives)
      • This change gets rid of 65% of the PFN entries -> 65% of 160 GB ≈ 100 GB
      • Slightly different from UNIX symlinks: AliEn links point to part of an archive, not the full file
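
A rough sketch of link resolution under this design, assuming each catalogue entry may carry the id of another entry in the same index plus the name of a member inside the target archive. The `Entry` class, its field names and the sample paths are hypothetical; only the idea of a link id pointing within the same index comes from the slides.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass
class Entry:
    lfn: str
    link_id: Optional[int] = None  # id of the target entry in the same index
    member: Optional[str] = None   # member inside the target archive, if a link

def resolve(index: Dict[int, Entry], entry_id: int) -> Tuple[str, Optional[str]]:
    """Return (LFN holding the data, member inside it or None)."""
    entry = index[entry_id]
    if entry.link_id is None:
        return entry.lfn, None       # a regular file: the data is the file itself
    target = index[entry.link_id]    # raises KeyError if the link is dangling
    return target.lfn, entry.member  # a link: one member of the target archive

index = {
    1: Entry("/example/job/root_archive.zip"),
    2: Entry("/example/job/output.root", link_id=1, member="output.root"),
}

print(resolve(index, 2))
```

Because the target is looked up by id in the same index, no separate ‘guid://’ PFN entry is needed for the link.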

  12. [Figure: database schema diagram; labels: x 100, x 250, linkId, "Link to documentation!"]

  13. Expectations
      • GUIDs gone, LFNs stay the same, PFNs significantly reduced
        • if the catalogue were consistent, there would be fewer GUIDs than LFNs
        • the remaining 35% of PFNs (350000) should be similar to the number of files stored in SEs (310000)

  14. Expectations
      • We get rid of ~50% of the catalogue in terms of size (240 GB from GUIDs + 100 GB from links)
      • Performance is expected to increase substantially as a result of all the changes
        • We’ve had good experience with the TaskQueue
      • Looks like simple changes from a high-level view
        • but it takes a lot of effort to check and adapt the code
        • GUIDs everywhere
        • Found old, malfunctioning and improvable code
      • When?
        • May-Jun: all tests working (AliEntest set)
        • July: ALICE_TEST prepared to ‘really’ test
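
The size estimate can be checked with a few lines of arithmetic. The input figures (700 GB total, 240 GB of GUID data, 160 GB of PFN data, 65% of PFN entries being links) are taken from the slides; the variable names are only for illustration.

```python
total_gb = 700     # current catalogue size
guid_gb = 240      # GUID tables, removed entirely
pfn_gb = 160       # PFN tables
link_percent = 65  # share of PFN entries that only exist to handle links

link_gb = pfn_gb * link_percent / 100  # space freed by removing link PFNs
saved_gb = guid_gb + link_gb
saved_fraction = saved_gb / total_gb

print(f"links: {link_gb:.0f} GB, total saved: {saved_gb:.0f} GB "
      f"({saved_fraction:.0%} of the catalogue)")
```

The exact figures are 104 GB from links and 344 GB overall, which the slides round to 100 GB and roughly 50% of the catalogue.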

  15. Testing
      • We want a production-like environment
        • Not only for the new catalogue, but for any change in AliEn
      • Going back to ALICE_TEST
        • Already had the first stages of trains and standard jobs running
      • We want to synchronize the test catalogue with the production one
        • And to execute production jobs
        • We know it is not that easy or beautiful (e.g. API)
      • Aiming to have newer versions of AliEn in production
        • In a ‘safe’ way
      • Test cases are very hard to cover

  16. To sum up
      • The File Catalogue is becoming unmaintainable
        • It is a critical part of AliEn
      • Some improvements are implemented; we need more
      • Moving to a GUIDless catalogue
      • Planning to have production-like testing
        • and to make newer versions easier
