High Performance Computing Cluster (HPCC)

High Performance Computing Cluster (HPCC). Mary Galvin, Managing Principal, American Innovations Consulting. http://www.aicnova.com https://www.linkedin.com/pub/mary-galvin/15/340/397. Big Data at LexisNexis. History of the HPCC.

Presentation Transcript


  1. High Performance Computing Cluster (HPCC) Mary Galvin Managing Principal, American Innovations Consulting http://www.aicnova.com https://www.linkedin.com/pub/mary-galvin/15/340/397

  2. Big Data at LexisNexis

  3. History of the HPCC • Timeline labels: Late 90s/Early 2000s, 2001, 2004, 2007, 2009, 2011, 2012 • Designed and developed from the ground up to meet LexisNexis’ internal Big Data needs. • The United States government sought to bring LexisNexis’ data capabilities in-house for its internal data mining needs. • Google’s MapReduce paper is published. • First release of Hadoop available (designed after the MapReduce papers). • The idea of releasing the HPCC to the OSS community was presented to LexisNexis corporate management. • The HPCC is officially released to the open source community! • The spread of HPCC users has gone global, and as a result, innovation ignites.

  4. HPCC Architectural Overview

  5. ECL Overview Task: Produce a set of records wherein a particular field contains a specific set of values Typical approach for solving this in many programming languages
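The task on this slide can be sketched in the imperative style common to many languages. The example below is only an illustration in Python, not the code shown on the slide: the record layout, the `state` field, the sample values, and the `filter_by_values` helper are all hypothetical.

```python
def filter_by_values(records, field, wanted):
    """Return the records whose `field` holds one of the `wanted` values."""
    wanted_set = set(wanted)          # set gives fast membership checks
    matches = []
    for record in records:            # walk every record explicitly
        if record.get(field) in wanted_set:
            matches.append(record)    # keep only the matching ones
    return matches


# Hypothetical sample data, for illustration only.
people = [
    {"name": "Ann", "state": "FL"},
    {"name": "Bob", "state": "NY"},
    {"name": "Cy", "state": "GA"},
]

selected = filter_by_values(people, "state", ["FL", "GA"])
```

The loop, the temporary list, and the membership test are all written out by hand, which is the part a declarative language aims to remove.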

  6. ECL Overview (cont’d) Task: Produce a set of records wherein a particular field contains a specific set of values Approach for solving this problem in ECL
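ECL is a declarative language, so the same task is stated as a definition of the desired result rather than as a loop. The snippet below is a rough analogy only: it is Python, not ECL syntax, and the `state` field, the sample records, and the names `WANTED_STATES` and `selected_people` are hypothetical.

```python
# The set of acceptable values is declared once, as data.
WANTED_STATES = frozenset({"FL", "GA"})

# Hypothetical sample data, for illustration only.
people = [
    {"name": "Ann", "state": "FL"},
    {"name": "Bob", "state": "NY"},
    {"name": "Cy", "state": "GA"},
]

# The result is described by its condition; no explicit loop
# counter or accumulator is managed by the author.
selected_people = [p for p in people if p.get("state") in WANTED_STATES]
```

Stating the result as a single definition leaves the choice of execution strategy to the platform, which is the design idea a declarative language such as ECL relies on.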

  7. HPCC Modules & Plugins • Other • H2H Connector • Machine Learning Module • R Integration • Eclipse IDE • JDBC Driver • …….. • Scalable Automated Linking Technology (SALT) • Data Ingest • Data Profiling • Data Hygiene • Clustering • Relationship Extraction • Exploratory Data Analysis (EDA) Toolkit

  8. HPCC Academic Program • Audience: Colleges and Universities • Benefits: • Internship opportunities • Invitation-only conferences • Free training for qualifying projects • Access to an external cluster, as available

  9. Additional Learning Options • Online: • Includes both prerequisites and courses tailored to role type (i.e., developers, analysts, and administrators) • http://hpccsystems.com/community/training-videos • In-Person: • http://hpccsystems.com/community/training-events/training

  10. Getting Started
