
Threads

Threads. Tutorial #7, CPSC 261. A thread is a virtual processor. Each thread is given the illusion that it owns a core: it has its own copy of the registers and appears to be running all the time. In fact, all of the threads share the hardware cores.

chakra

Presentation Transcript


  1. Threads Tutorial #7 CPSC 261

  2. A thread is a virtual processor • Each thread is provided the illusion that it owns a core • Copy of the registers • It is running all the time • In fact, all of the threads share the hardware cores • The operating system rapidly switches the cores among the threads that want to run, providing this illusion that each thread owns a core

  3. POSIX standard: pthreads • Threads are created via pthread_create() • A thread dies when it: • returns from the function given to pthread_create • or calls pthread_exit() • You can wait for a thread to complete via pthread_join()
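The three calls on this slide can be sketched in a few lines. This is a minimal illustration, not code from the tutorial: the names `double_it` and `run_child` are hypothetical, and passing a small integer through the `void *` argument via `intptr_t` is a common idiom assumed here for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical thread body: doubles the number passed in through arg. */
static void *double_it(void *arg) {
    intptr_t n = (intptr_t)arg;
    /* Returning from this function ends the thread; the return value
       is handed to whoever calls pthread_join() on it. */
    return (void *)(n * 2);
}

/* Starts one child thread, waits for it, and returns its result. */
long run_child(long n) {
    pthread_t child;
    void *result = NULL;

    int rc = pthread_create(&child, NULL, double_it, (void *)(intptr_t)n);
    assert(rc == 0);                     /* pthread_create returns 0 on success */

    rc = pthread_join(child, &result);   /* blocks until the child has finished */
    assert(rc == 0);

    return (long)(intptr_t)result;
}
```

Calling `pthread_exit(value)` inside `double_it` would end the thread in the same way as `return value`.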

  4. Visualizing thread execution [Figure: the main thread calls pthread_create() to start child threads, then calls pthread_join() to wait for them]

  5. PingPong.c
     void *p(void *arg) {
         long i;
         for (i = 0; i < LIMIT; ++i) {
             counter = counter + 1;   /* shared variable, no lock */
         }
         return 0;
     }
     int main(...) {
         pthread_create(&t1, NULL, &p, "ping");
         pthread_create(&t2, NULL, &p, "pong");
         pthread_join(t1, NULL);
         pthread_join(t2, NULL);
     }

  6. Questions you might ask • What if t2 finishes before t1? • The finished thread just waits, for as long as necessary, until the main thread joins with it • How many threads can I create? • It depends. On Linux, a few tens of thousands

  7. The big thing that goes wrong • Uncontrolled access to memory • race condition • with multiple writers of a single shared variable, updates can get lost • Fixed by: • single writers • locks
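To see how an update gets lost, it helps to spell out `counter = counter + 1` as a load, an add, and a store. The sketch below is a single-threaded simulation of one unlucky interleaving of two threads; the function name and the `reg1`/`reg2` variables (standing in for each thread's private register) are illustrative assumptions.

```c
/* Simulates two threads that each execute counter = counter + 1,
   in the interleaving where both load before either stores. */
long lost_update_demo(void) {
    long counter = 0;

    long reg1 = counter;   /* thread 1 loads counter (sees 0) */
    long reg2 = counter;   /* thread 2 loads counter (also sees 0) */

    reg1 = reg1 + 1;       /* thread 1 adds one in its own register */
    reg2 = reg2 + 1;       /* thread 2 adds one in its own register */

    counter = reg1;        /* thread 1 stores 1 */
    counter = reg2;        /* thread 2 stores 1, overwriting thread 1's update */

    return counter;        /* two increments happened, but only one survived */
}
```

With real threads the interleaving is chosen by the scheduler, so the loss happens only some of the time, which is what makes race conditions hard to reproduce.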

  8. Before-and-after atomicity • Sometimes you need an arbitrary sequence of operations to be atomic • A sequence of operations that needs to be atomic is also called a critical section • Mutual exclusion means only one thread at a time (one thread in the critical section excludes all others)

  9. Achieving mutual exclusion • In pthreads, mutual exclusion is provided by mutex objects • allocated like any other object (for example with malloc) • initialized by pthread_mutex_init() • acquired by pthread_mutex_lock() • released by pthread_mutex_unlock()
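The life cycle of a mutex described on this slide might look like the following sketch. The helper names `make_lock`, `free_lock`, and `locked_increment` are hypothetical, and the cleanup call `pthread_mutex_destroy()` is not mentioned on the slide; it is the standard counterpart of `pthread_mutex_init()`.

```c
#include <pthread.h>
#include <stdlib.h>

/* Allocates a mutex with malloc and initializes it with default
   attributes. Returns NULL if either step fails. */
pthread_mutex_t *make_lock(void) {
    pthread_mutex_t *lock = malloc(sizeof *lock);
    if (lock == NULL) {
        return NULL;
    }
    if (pthread_mutex_init(lock, NULL) != 0) {
        free(lock);
        return NULL;
    }
    return lock;
}

/* Destroys and frees a mutex from make_lock(). It must not be held. */
void free_lock(pthread_mutex_t *lock) {
    pthread_mutex_destroy(lock);
    free(lock);
}

/* Adds one to *value while holding the lock. Returns 0 on success. */
int locked_increment(pthread_mutex_t *lock, long *value) {
    if (pthread_mutex_lock(lock) != 0) {
        return -1;
    }
    *value += 1;                          /* the critical section */
    return pthread_mutex_unlock(lock) == 0 ? 0 : -1;
}
```

A mutex with static storage can skip the malloc and init steps by being initialized with `PTHREAD_MUTEX_INITIALIZER`.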

  10. The lock idiom • Every critical section is protected by a mutex • The code looks like:
     pthread_mutex_lock(&lock);
     // critical section
     pthread_mutex_unlock(&lock);

  11. Locking PingPong
     void *p(void *arg) {
         long i;
         for (i = 0; i < LIMIT; ++i) {
             pthread_mutex_lock(&lock);
             // Once the lock is held, this
             // "critical section" can be as long
             // as you need or want it to be
             counter = counter + 1;
             pthread_mutex_unlock(&lock);
         }
         return 0;
     }
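For reference, here is one way the locked version could be filled out into something that compiles on its own. The slides do not show the declarations, so the value of `LIMIT`, the static initialization of `lock`, and the wrapper `run_locked_pingpong` are all assumptions made for this sketch.

```c
#include <pthread.h>

#define LIMIT 100000L   /* assumed value; the slides do not define LIMIT */

static long counter = 0;                                   /* shared data */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;   /* protects counter */

static void *p(void *arg) {
    (void)arg;   /* the "ping"/"pong" label is not used in the loop */
    for (long i = 0; i < LIMIT; ++i) {
        pthread_mutex_lock(&lock);
        counter = counter + 1;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Runs the two threads and returns the final counter, or -1 on failure. */
long run_locked_pingpong(void) {
    pthread_t t1, t2;
    counter = 0;
    if (pthread_create(&t1, NULL, p, "ping") != 0) {
        return -1;
    }
    if (pthread_create(&t2, NULL, p, "pong") != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;
}
```

Because every increment happens with the lock held, no update can be lost and the result is always `2 * LIMIT`. On most systems the program is built with the compiler's `-pthread` option.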

  12. Multiple critical sections • If multiple critical sections access the same shared data, they all need to be protected by the same lock

  13. Multiple critical section idiom
     pthread_mutex_lock(&lock);
     // critical section
     // for thread 1
     pthread_mutex_unlock(&lock);

     pthread_mutex_lock(&lock);
     // critical section
     // for thread 2
     pthread_mutex_unlock(&lock);
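A concrete case of this idiom, using a made-up bank balance rather than anything from the tutorial: three different functions touch the same variable, so all three take the same mutex. The names `balance`, `balance_lock`, `deposit`, `withdraw`, and `current_balance` are hypothetical.

```c
#include <pthread.h>

static long balance = 0;                                          /* shared data */
static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;  /* the one lock for it */

/* Critical section 1: add money. */
void deposit(long amount) {
    pthread_mutex_lock(&balance_lock);
    balance += amount;
    pthread_mutex_unlock(&balance_lock);
}

/* Critical section 2: remove money if there is enough.
   Returns 1 on success and 0 if the balance was too low. */
int withdraw(long amount) {
    int ok = 0;
    pthread_mutex_lock(&balance_lock);
    if (balance >= amount) {      /* the check and the update are one atomic unit */
        balance -= amount;
        ok = 1;
    }
    pthread_mutex_unlock(&balance_lock);
    return ok;
}

/* Critical section 3: even a plain read takes the same lock. */
long current_balance(void) {
    pthread_mutex_lock(&balance_lock);
    long b = balance;
    pthread_mutex_unlock(&balance_lock);
    return b;
}
```

If `withdraw` used a different mutex than `deposit`, the two could run at the same time and the race on `balance` would return.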

  14. Locking issues • Fine-grained locks • more parallelism • more overhead • more complexity • Coarse-grained locks • less parallelism • less overhead • simpler • Deadlock
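The slide only names deadlock, so the following is an illustration of one well-known way to avoid it, not the tutorial's own example. With fine-grained locking (one mutex per account here), two threads that take the same two locks in opposite orders can each end up waiting for the other forever. Always acquiring locks in a fixed order prevents that. All names and values below are hypothetical.

```c
#include <pthread.h>

#define NACCOUNTS 2
#define NTRANSFERS 10000

static long balances[NACCOUNTS] = {100, 100};
static pthread_mutex_t locks[NACCOUNTS] = {   /* fine-grained: one lock per account */
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/* Moves amount between two accounts. Locks are always taken in
   increasing index order, whichever direction the money flows. */
void transfer(int from, int to, long amount) {
    if (from == to) {
        return;                   /* avoid locking the same mutex twice */
    }
    int first = from < to ? from : to;
    int second = from < to ? to : from;
    pthread_mutex_lock(&locks[first]);
    pthread_mutex_lock(&locks[second]);
    balances[from] -= amount;
    balances[to] += amount;
    pthread_mutex_unlock(&locks[second]);
    pthread_mutex_unlock(&locks[first]);
}

/* Each mover repeatedly sends 1 unit from its own account to the other. */
static void *mover(void *arg) {
    int from = *(int *)arg;
    for (int i = 0; i < NTRANSFERS; ++i) {
        transfer(from, 1 - from, 1);
    }
    return NULL;
}

/* Runs two movers in opposite directions; returns the total money, or -1. */
long run_transfers(void) {
    pthread_t a, b;
    int zero = 0, one = 1;
    if (pthread_create(&a, NULL, mover, &zero) != 0) {
        return -1;
    }
    if (pthread_create(&b, NULL, mover, &one) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return balances[0] + balances[1];
}
```

A single coarse-grained lock around every transfer would also be deadlock-free and simpler, at the cost of letting only one transfer run at a time.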

  15. Sample thread code • Lots of examples in the threads directory of the lectures repository

  16. Using threads for speedup [Figure: speedup plotted against cores used; perfect speedup is the 45° line]
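The plot's axes can be made precise with the usual definition of speedup. The symbols below are notation assumed for this note rather than taken from the slides: $T_1$ is the running time on one core and $T_p$ the running time on $p$ cores.

```latex
S(p) = \frac{T_1}{T_p}, \qquad \text{perfect speedup: } S(p) = p
```

Perfect speedup is the 45° line because doubling the number of cores would exactly halve the running time.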

  17. Things to think about • Each core has its own cache • The L3 cache is shared between all the cores • If you have 8 cores, how many “pieces of work” should you create? • 8? • <8? • >8?
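Before deciding how many pieces of work to create, a program may want to ask how many cores it actually has. The sketch below assumes a system such as Linux with glibc, where `sysconf()` accepts `_SC_NPROCESSORS_ONLN`; that name is a widely supported extension and may not exist everywhere. The function name `online_cores` is hypothetical.

```c
#include <unistd.h>

/* Returns the number of cores currently online, or 1 if the
   system cannot report it. */
long online_cores(void) {
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}
```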

  18. More things to think about • What if the “pieces of work” aren’t all the same size? • Or what if one thread is slower than the other threads? • Why could this be? • Randomness • Interference with other activity on the machine • These slow threads are called “stragglers” and are a real problem in practice

  19. The ideal case – all threads finish at the same time

  20. What might happen

  21. Even more things to think about • Suppose I have 8 cores. • Should I create 8 threads, one for each core? • Or more than 8 threads, to deal with "stragglers"? • Or ...
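The slide leaves the question open. One possible approach, sketched here as an assumption rather than as the tutorial's answer, is to keep 8 threads but cut the job into many more than 8 pieces and let each thread fetch its next piece from a shared counter. A thread that is slowed down simply ends up taking fewer pieces. The constants, `do_piece`, and the other names are hypothetical.

```c
#include <pthread.h>

#define NTHREADS 8    /* one per core, following the slide's 8-core example */
#define NPIECES 64    /* assumed: many more pieces of work than threads */

static int next_piece = 0;    /* index of the next unclaimed piece */
static long total = 0;        /* combined result of all pieces */
static pthread_mutex_t work_lock = PTHREAD_MUTEX_INITIALIZER;  /* protects both */

/* Hypothetical unit of work: here just the square of the piece index. */
static long do_piece(int piece) {
    return (long)piece * piece;
}

static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        /* Claim the next piece, or learn that none are left. */
        pthread_mutex_lock(&work_lock);
        int piece = next_piece < NPIECES ? next_piece++ : -1;
        pthread_mutex_unlock(&work_lock);
        if (piece < 0) {
            break;
        }

        long result = do_piece(piece);   /* the work itself runs without the lock */

        pthread_mutex_lock(&work_lock);
        total += result;
        pthread_mutex_unlock(&work_lock);
    }
    return NULL;
}

/* Runs the workers and returns the combined result, or -1 if no
   thread could be started. */
long run_pool(void) {
    pthread_t threads[NTHREADS];
    int started = 0;
    next_piece = 0;
    total = 0;
    while (started < NTHREADS &&
           pthread_create(&threads[started], NULL, worker, NULL) == 0) {
        ++started;
    }
    for (int i = 0; i < started; ++i) {
        pthread_join(threads[i], NULL);
    }
    return started > 0 ? total : -1;
}
```

The trade-off mirrors the one for lock granularity: smaller pieces balance the load better but mean more trips through the lock.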
