340 likes | 377 Vues
Introduction to Concurrency: Introduction to Concurrency. Majeed Kassis. What is Concurrency?. The ability to run multiple tasks on a single CPU at the same time. Central to the design of modern operating systems is managing multiple processes: Multi-threading Multi-processing
E N D
Introduction to Concurrency:Introduction to Concurrency MajeedKassis
What is Concurrency? • The ability to run multiple tasks on a single CPU at the same time. • Central to the design of modern operating systems is managing multiple processes: • Multi-threading • Multi-processing • Distributed processing • Central issue is managing Concurrency: • Communication among processes • Sharing of and competing for resources (memory, files, and I/O access) • Synchronization of the activities of multiple processes • Allocation of processor time to processes
Core Terms • Atomic Operation: • A sequence of indivisible statements; executed without interruptions • Critical Section: • A section of code that requires access to shared resource • Cannot be executed concurrently • Mutual Exclusion • The ability to allow single access to a critical section at any given time • Race Condition: • A flaw in the application that occurs if the timing or the ordering of events are important, leading to a wrong result in case the timing or ordering was not enforced • Example: • Two threads incrementing a shared value in a loop. • Condition Variable • A variable allowing threads to block as long as the variable value did not change • Once the variable value changes, then the condition is met, allowing the blocked threads to continue working.
work CPUs/cores CPUs/cores resource Parallel vs. Concurrent • Parallel: Using multiple processing resources(CPUs, cores) at once to solve a problem faster. • Example: A sorting algorithm that has severalthreads each sort part of the array. • Concurrent: Multiple execution flows (e.g. threads)accessing a shared resource at the same time. • Example: Many threads trying to make changes tothe same data structure (a global list, map, etc.).
Concurrency • Unlike parallelism, not always about running faster. • Even a single-CPU, single-core machine may want concurrency. • Useful for: • App responsiveness • Example: Respond to GUI events in one thread while another thread is performing an expensive computation • Processor utilization (mask I/O latency) • If 1 thread is stuck working, others have something else to do • Failure isolation • Convenient structure if want to interleave multiple tasks and do not want an exception in one to stop the other
Concurrency is done by Time Slicing • Time Slicing: • Many threads want to access a single shared resource • Instead of giving each time its time to finish working in a queue • We assign a “quantum” of time on which each thread may use. • Once quantum is used up, the next thread gains access • We only ensure time fairness, not performance fairness! • If a given piece of code is run by two threads at once: • The order of which thread gets to run first is unpredictable. • How many statements of one thread run before the other thread runs some of its own statements is unpredictable.
Possible Solutions • Data Corruption • The use of mutual exclusion mechanisms on critical sections prevents data corruption. • Causes performance slowdowns! • Adds complexity! • Communication • Prevent racing condition by using conditional variables. • The use of conditional variables enables partial ordering in a concurrent system. • Causes performance slowdowns! • Adds complexity!
Concerns in Concurrent Systems • Data Corruption • Due to the nature of concurrent systems, the shared data may get corrupted. • Communication • Communication between tasks is needed to avoid conflicts. • Runtime • Bad design or implementation of the system may cause slowdowns. • Complexity • Concurrent systems are more complex than sequential ones.
Possible Solutions - Continued • Runtime • Minimize the use of mutual exclusion mechanisms. • Minimize the size of critical sections. • Complexity • A big problem due to the use of mutual exclusion mechanisms. • Lock-free alternatives?
Concurrency in Distributed Systems MajeedKassis
What is a distributed system? • A distributed system is a collection of independent computers that appear to the users of the system as a single system. • They communicate by using a network by passing messages. • Successful distributed system requires: • Networking • Processors • Memory and Storage • Communication Protocols • Without the advancementof technology in these five categories, we would not have distributed systems today.
Advancements of Technology: Networking • Robert Metcalfe presents the concept of Ethernet at the National Computer Conference in 1976. • Ethernet introduced as networking standard in 1980.
Ethernet Progress Throughout the Years 1980: 2.94Mbps – 367kB/s 1982: 10Mbps - 1.25MB/s (thick cable) 1985: 10Mbps – 1.25MB/s (thin cable) 1995: 100Mbps – 12.5MB/s 1999: 1Gbps – 125MB/s 2001: 10Gbps – 1250MB/s 2010: 40Gbps/100Gbps 2017: 400Gbps
Advancements of Technology: Networking • Wi-Fi - Wireless Fidelity • It is a wireless technology that uses radio frequency to transmit data through the air • 1985: • 1990: IEEE established 802.11 group • 1997: 802.11Legacy, 1/2Mbit/sec, 2.4Ghz • 1999: 802.11a, 6-54Mbit/sec, 5Ghz • 2003: 802.11g, 6-54Mbit/sec, 2.4Ghz • 2009: 802.11n, 65Mbit/135Mbit/sec 2.4Ghz/5Ghz • First MIMO (multiple-input/multiple-output) streams! • This allows up to 4x4 connectivity! • 2013: 802.11ac, 866.7Mbit/sec max, 5Ghz • 8x8 MIMO • 2016: 802.11ad, 7Gbit/sec, 60Ghz
Wi-Fi Advantages • Freedom • You can work from any location that you can get a signal. • Setup Cost • No cabling required. • Flexibility • Quick and easy to setup in temp or permanent space. • Scalable • Can be expanded with growth. • Mobile Access • Can access the network on the move.
Side effect? Network Size! • 1985: • Large companies and universities • Consumers using dial-up • Gateways between networks • 1961 hosts online! • 2017: • One large internet • Widespread connectivity • High speed • Over 1 billion hosts online!
Side effect? Communication speed! • 1985: • Streaming impractical • Downloading one song will take ~15minues using 44Kbps modem • Today: • Streaming high quality songs is trivial. Google Music, Apply Music, Spotify, etc • Downloading one song will take 0.4seconds using 100Mbps connection • Streaming 4K Netflix video is instantaneous.
Advancements of Technology: Processing Power • CPUs: • Smaller • Cheaper • Power efficient • Faster • Microprocessors became the core of all systems today. • Intel x86 architecture for heaving computing • 15Watt – i7 7600U - Ultrabooks • 65Watt – i7 7700 – Desktops • 91Watt - i7 7700K – Faster desktops • ARM architecture for lower power mobility • CPU – 2Watt • GPU – 2WAtt • MMU – 0.5Watt
History of CPUs – Consumer Desktop CPUs • 1977: 8080, 2Mhz, 6K transistors • 1985: 386DX, 33Mhz, 275K transistors • 1995: Pentium Pro, 200Mhz, 5.5M transistors • 2003: Pentium 4, 3.8Ghz, 125M transistors – Fastest CPU in Mhz • 2005: Pentium D, 3.7Ghz, 169M transistors – First dual core CPU • 2008: i7 920, 2.67Ghz, 731M transistors – 4 core CPU • 2017: i7 7700K, 4.2Ghz, 5.2B transistors – 4 core CPU • AMD: Ryzen 1800X, 3.6Ghz, 4.8B transistors – 8 core CPU
Advancement of Technology: Storage • 1975: 5.25” Floppy Disk • 1981: 3.5” Floppy Disk • 1983: 3.5” HDD • 1984: CD-Rom • 1988: 2.5” HDD • 1995: DVD-Rom • 1999: Flash Memory • 2003: Blu-Ray
Side effect? • Storage costs reduced drastically: • 1985: 50GB would cost $511,640 using over 1279 drives • Today: 4TB HDD costs $120 • This allowed many companies to massively increase in size base of its customers, mainly customer based content creation websites
Advancement of Technology: Protocols • Base network protocols developed over the years: • HTTP – application layer • TCP – transport layer • IP – internet layer • Ethernet – network interface • Massive amount of distributed programming APIs developed • We will learn some of these APIs • Non-Blocking IO • AsyncIO /Node • ReactiveX • Reactive REST
Distributed Systems: Advantages • Economical • Can achieve good price/performance radio in contrary to centralized systems • Reliable • Built, by design, to be fault tolerant to hardware failures • Incremental Growth • Allows for performance/storage growth in tiny steps • Uses consumer grade hardware! • Scalable • The system can grow as needed without the need for drastic changes in its architecture
Scalability Examples • Distributed systems scale in contrary to multiprocessor systems. • What does that mean? • Google search engine: • In 1999 it took 1 month to crawl and build an index of about 50 million pages. • In 2012, the same task was accomplished in less than one minute. • A single query uses 1,000 computers in 0.2 seconds to retrieve an answer • Disney’s Cars 2 movie. • Rendering a single frame took 11.5 hours on average. • They used 12,500 cores! • Monsters University required (on average) 29 hours per frame. • Total render time: 100 million CPU hours. • They’ve used 5000 AMD processors over 10Gbps networks.
Distributed Models: Client-Server Model Client sends requests to server. Server is a system that runs a service. The server is always on, and processes requests from clients. Clients do not communicate with other clients. Examples: HTTP, FTP, e-mail
Multi Tier Architecture • A client server architecture • Splits server into three tiers • Each tier can be modified and updated independently of the rest of the tiers! • Example: (right) • 3-Tier architecture for web systems
Peer-to-Peer Model: Napster (Centralized) • 1999: First peer-to-peer application! • Clients directly connected to each other • Peers act as clients and servers. • Peers initiate sessions with each other. • Central server for indexing only. • Drawback? • Advantage? • Can search for files.
Peer-to-Peer: BitTorrent (Decentralized) • No central server! • Each node connected to many other nodes. • Files are divided into small pieces! • Downloading from many users • Uploading to many users • Can be pure peer-to-peer with super-peers acting as trackers. • Drawback? • Cannot search for files • Advantages • Faster downloads • Fast distribution of popular content • Major users of P2P? • Windows Update! • Microsoft Skype • Blizzard Entertainment
Cloud Computing • Resources are provided as an internet service • Software – SaaS • Remotely hosted software • Google Apps, Office 365 • Infrastructure – IaaS • Scalable, high performance virtual machines • Amazon Web Services – AWS, Google Compute Engine • Platforms – PaaS • Scalable platforms to build web applications • Google App Engine, Amazon Elastic Beanstalk • Cloud Storage • Remote file storage • DropBox, Google Drive
Distributed Systems: Design Challenges • Reliability: • Data must never get lost due to failures • Fault tolerance – in case of hardware, software, or network failure • Availability: • Must be online all the time • Reduce latency • Scalability: • Adding more processing power increases performance • Handling geographical distance • Security • Authentication, authorization, encryption
Concerns in Distributed Systems • Concurrency of components • Each component is independent • Has no access to other component resources • How to share data? • How to modify shared data? • Lack of global clock • Event ordering issue • Independent failure of components • Failed component must not affect the system • What happens if you are waiting on another process? • Failed component must not cause data loss