Updating Computer Science Education

Updating Computer Science Education. Jacques Cohen Brandeis University Waltham, MA USA January 2007. Topics. Preliminary remarks Present state of affairs and concerns Objectives of this talk Trends ( hardware, software, networks, others) Illustrative examples Suggestions.


Presentation Transcript


  1. Updating Computer Science Education Jacques Cohen Brandeis University Waltham, MA USA January 2007

  2. Topics • Preliminary remarks • Present state of affairs and concerns • Objectives of this talk • Trends (hardware, software, networks, others) • Illustrative examples • Suggestions

  3. Present state of affairs and concerns • Huge increase in PC and internet usage. • Decreasing enrollment. (USA mainly)

  4. Possible Reasons • Previous high school preparation • Bubble burst (2000) + outsourcing • Widespread usage of computers by lay persons • Interest in interdisciplinary topics (e.g., biology, business, economics) • Public perception about: What is Computer Science?

  5. The Nature of Computer Science • Two main components: Theoretical and Experimental; Mathematics and Engineering. What characterizes CS is the notion of algorithms • Emphasis on the discrete and on logic. An interdisciplinary approach with other sciences may well revive interest in the continuous (or in the use of qualitative reasoning)

  6. Related fields • Sciences in general (scientific computing), • Management, • Psychology (human interaction), • Business, • Communications, • Journalism, • Arts, etc.

  7. The role of Computer Science among other sciences (How we are perceived by the other sciences) • In physics, chemistry, biology, nature is the ultimate umpire. Discovery is paramount • In math and engineering: aesthetics, ease of use, acceptance, permanence, play key roles

  8. Uneasy dialogue with biologists • It is not unusual to hear from a physicist, chemist or biologist: “If computer scientists do not get involved in our field, we will do it ourselves!!” • It looks very likely that the biological sciences (including, of course, neuroscience) will dominate the 21st century

  9. Differences in approaches • Most scientific and creative discoveries proceed in a bottom-up manner • Computer scientists are taught to emphasize top-down approaches • Polya’s “How to Solve It” often mentions: “First specialize, then generalize.” • Hacking is beautiful (mostly bottom-up)

  10. Objectives • Provide a bird’s eye view of what is happening in CS education (USA) and attempt to make recommendations about possible directions. Hopefully, some of it would be applicable to European universities. Premise • Changes ought to be gradual and depend on resources and time constraints

  11. First we have to observe current trends: Generality, Storage, Speed, Networks, others. • Trying to make sense of present directions. It is difficult and risky to foresee the future, e.g., PC (windows, mouse), internet, parallelism • Topics influencing computer science education. • Trends in hardware, software, networks.

  12. Huge volume of data (terabytes and petabytes) • Statistical nature of data • Clustering, classification • Probability and Statistics become increasingly important

  13. Trend towards generality • Need to know more about what is going on in related topics A few examples: • Robotics and mechanical engineering • Hardware, electrical engineering, material science, nanotechnology • Multi-field visualization (e.g., medicine) • Biophysics and bioinformatics

  14. Nature of data structures • Sequences (strings), streams • Trees, DAGs, and Graphs • 3D structures • Emphasis on discrete structures • Neglect of the continuous should be corrected (e.g., use of MATLAB)

  15. Trends in data growth: How Much Information Is There In the World? • The 20-terabyte size of the Library of Congress is derived by assuming that LC has 20 million books and each requires 1 MB. Of course, LC has much other stuff besides printed text, and this other stuff would take much more space. • From Lesk http://www.lesk.com/mlesk/ksg97/ksg.html

  16. Library of Congress data (cont) 1. Thirteen million photographs, even if compressed to a 1 MB JPG each, would be 13 terabytes. 2. The 4 million maps in the Geography Division might scan to 200 TB. 3. LC has over five hundred thousand movies; at 1 GB each they would be 500 terabytes (most are not full-length color features). 4. Bulkiest might be the 3.5 million sound recordings, which at one audio CD each, would be almost 2,000 TB. This makes the total size of the Library perhaps about 3 petabytes (3,000 terabytes).
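The estimate above can be checked by adding up the per-collection figures quoted on the two Library of Congress slides. This is only a sketch of that arithmetic: the dictionary labels are illustrative names, and decimal units (1 PB = 1,000 TB) are assumed, as on the slide.

```python
# Per-collection estimates in terabytes, as quoted on the slides above.
# The dictionary keys are illustrative labels, not terms from the source.
estimates_tb = {
    "printed books (20 million at 1 MB)": 20,
    "photographs (13 million at 1 MB)": 13,
    "maps (4 million)": 200,
    "movies (500,000 at 1 GB)": 500,
    "sound recordings (3.5 million audio CDs)": 2000,
}

total_tb = sum(estimates_tb.values())
total_pb = total_tb / 1000  # assumes decimal units: 1 PB = 1,000 TB

print(f"Total: {total_tb} TB, roughly {total_pb:.1f} PB")
```

The sum lands a little under the "about 3 petabytes" quoted, which is consistent with the slide rounding up and with the sound recordings dominating the total.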

  17. How Much Information Is There In the World?

  18. Lesk’s Conclusions • There will be enough disk space and tape storage in the world to store everything people write, say, perform or photograph. For writing this is true already; for the others it is only a year or two away.

  19. Lesk’s Conclusions (cont) • The challenge for librarians and computer scientists is to let us find the information we want in other people's work; and the challenge for the lawyers and economists is to arrange the payment structures so that we are encouraged to use the work of others rather than re-create it.

  20. The huge volume of data implies: • Linearity of algorithms is a must • Emphasis on pattern matching • Increased preprocessing • Different levels of memory transfer rates • Algorithmic incrementality (avoid redoing tasks) • Need for approximate algorithms (optimization) • Distributed computing • Centralized parallelism (Blue Gene, Argonne)

  21. The importance of pattern matching (searches) in large numbers of items. Pattern matching has to be “tolerant” (approximate). Find closest matches (dynamic programming, optimization) • Sequences • Pictures • 3D structures (e.g. proteins) • Sound • Photos • Video
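For sequences, "tolerant" matching can be illustrated with a small dynamic-programming search that finds the substring of a text closest to a pattern in edit distance. This is a minimal sketch under assumed unit costs for substitutions, insertions and deletions; the function name and the example strings are hypothetical, not taken from the talk.

```python
def best_approximate_match(pattern, text):
    """Return (edit_distance, end_index) for the substring of text closest to pattern.

    end_index is exclusive: the best-matching substring ends just before it.
    """
    # prev[i] = fewest edits between pattern[:i] and some substring of the
    # text ending at the previous text position.
    prev = list(range(len(pattern) + 1))
    best = (prev[-1], 0)
    for j, ch in enumerate(text, start=1):
        curr = [0]  # a match may start anywhere, so the empty pattern costs 0
        for i, p in enumerate(pattern, start=1):
            curr.append(min(
                prev[i - 1] + (p != ch),  # letters aligned (free if equal)
                prev[i] + 1,              # extra letter in the text
                curr[i - 1] + 1,          # extra letter in the pattern
            ))
        if curr[-1] < best[0]:
            best = (curr[-1], j)
        prev = curr
    return best


distance, end = best_approximate_match("GATTACA", "CCGATTTACACC")
print(distance, end)
```

The running time is proportional to len(pattern) × len(text), which is fine for short queries but is the cost that the preprocessing and approximation ideas discussed later in the talk try to avoid on very large data.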

  22. Trends in computer cycles (speed) • Moore’s law appears to be applicable until at least 2020

  23. Use of supercomputers (2006) • Researchers at Los Alamos National Laboratory have set a new world's record by performing the first million-atom computer simulation in biology. Using the "Q Machine" supercomputer, Los Alamos computer scientists have created a molecular simulation of the cell's protein-making structure, the ribosome. The project, simulating 2.64 million atoms in motion, is more than six times larger than any biological simulations performed to date.

  24. Graphical visualization of the simulation of a Ribosome at work

  25. Network transmission speed (Lambda Rail Net) • USA backbone

  26. Trends in Transmission Speed • The High Energy Physics team's demonstration achieved a peak throughput of 151 Gbps and an official mark of 131.6 Gbps, beating their previous mark for peak throughput of 101 Gbps by 50 percent.

  27. Trends in Transmission Speed II • The new record data transfer speed is also equivalent to serving 10,000 MPEG2 HDTV movies simultaneously in real time, or transmitting all of the printed content of the Library of Congress in 10 minutes.
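As a rough sanity check on the HDTV comparison, the bit rate available to each stream can be computed directly. The sketch assumes that the 131.6 Gbps official mark from the preceding slide is the rate meant here and that the bandwidth is shared evenly; both are assumptions, and the variable names are illustrative.

```python
official_rate_bps = 131.6e9  # assumed: the official mark quoted on the previous slide
streams = 10_000             # simultaneous movies, as stated on this slide

# Even split of the link across all streams, expressed in megabits per second.
per_stream_mbps = official_rate_bps / streams / 1e6

print(f"{per_stream_mbps:.2f} Mbps per stream")
```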

  28. Trend in Languages • Importance of scripting and string processing: XML, Java, C++ • Trend towards Python, Matlab, Mathematica • No ideal languages • No agreement on what the first language ought to be

  29. A recently proposed language (Fortress 2006) • From Guy Steele, The Fortress Programming Language, Sun Microsystems http://iic.harvard.edu/documents/steeleLecture2006public.pdf

  30. Fortress Language (Sun, Guy Steele)

  31. Meta-level approach to teaching • Learn 2 or 3 languages and assume that expertise in other languages can be acquired on the fly. • Hopefully, the same will occur in learning a topic in depth. Once in-depth research is taught using a particular area it can be extrapolated to other areas. • Increasing usage of canned programs or data banks Typical examples: GraphViz, WordNet

  32. Trends in Algorithmic Complexity • Overcoming the scare of NP problems (it happened before with undecidability) • 3-SAT lessons • Mapping polynomial problems within NP • Optimization, approximate or random algorithms

  33. Three Examples • Example I The lessons of BLAST (preprocessing, incrementability, approximation) • Example II The importance of analyzing very large networks. (probability, sensors, sociological implications) • Example III Time Series. (data mining, pattern searches, classification)

  34. Example I (History of BLAST): sequence alignment • Biologists matched sequences of nucleotides or amino acids empirically using dot matrices

  35. Dot matrices
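A dot matrix marks every pair of positions at which two sequences carry the same letter, so shared stretches show up as runs along a diagonal. The following is a small illustrative sketch; the function names and the example sequences are hypothetical and do not come from the slides.

```python
def dot_matrix(a, b):
    """Rows follow sequence a, columns follow sequence b; True marks equal letters."""
    return [[x == y for y in b] for x in a]


def render(a, b):
    """Draw the matrix as text, with '*' for a dot and '.' for an empty cell."""
    lines = ["  " + " ".join(b)]
    for letter, row in zip(a, dot_matrix(a, b)):
        lines.append(letter + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("GATTACA", "GATCACA"))
```

The matrix has len(a) × len(b) cells, which is why this visual approach stops being practical once one of the sequences is a whole genome.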

  36. No exact matching

  37. Alignment with Gaps

  38. Dynamic Programming Approach

  39. Dynamic Programming complexity O(n^2)
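The quadratic cost comes from filling one table cell per pair of positions in the two sequences. Below is a minimal sketch of global alignment with gaps in the style of Needleman-Wunsch. The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration; the slides do not specify a scoring scheme.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]; (n+1)*(m+1) cells.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two letters
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back one optimal path from the bottom-right corner.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


best, row_a, row_b = global_alignment("GATTACA", "GATACA")
print(best)
print(row_a)
print(row_b)
```

The traceback returns a single optimal alignment. Several paths through the table can share the same score, and the sketch simply follows the first one it finds.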

  40. Two solutions with gaps. Complexity can be exponential for determining all solutions

  41. The BLAST approach: complexity is almost linear. Equivalent dot matrices would have 3 billion columns (human genome) and Z rows, where Z is the size of the sequence being matched against a genome (possibly tens of thousands)

  42. BLAST Tricks • Preprocessing: compile the locations in a genome containing all possible “seeds” (combinations of 6 nucleotides or amino acids) • Hacking • Follow diagonals as much as possible (BLAST strategy) • Use dynamic programming as a last resort
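The preprocessing and diagonal-following ideas can be sketched as a toy seed-and-extend search. This is not the BLAST implementation: it extends only while letters match exactly, whereas the real program scores extensions and tolerates mismatches. All function names and the example strings are hypothetical.

```python
from collections import defaultdict


def build_seed_index(genome, k=6):
    """Preprocessing: record every position at which each k-letter seed occurs."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def extend_along_diagonal(genome, query, g, q, k):
    """Grow an exact seed hit in both directions while the letters keep matching."""
    left = 0
    while g - left - 1 >= 0 and q - left - 1 >= 0 \
            and genome[g - left - 1] == query[q - left - 1]:
        left += 1
    right = k
    while g + right < len(genome) and q + right < len(query) \
            and genome[g + right] == query[q + right]:
        right += 1
    return g - left, q - left, left + right  # genome start, query start, length


def find_hits(genome, query, index, k=6):
    """Look up each query seed in the index and extend every occurrence."""
    hits = set()
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], ()):
            hits.add(extend_along_diagonal(genome, query, g, q, k))
    return sorted(hits, key=lambda hit: -hit[2])  # longest hits first


genome = "CCCCCGATTACAGGCCCCC"
index = build_seed_index(genome)  # built once, reused for every query
print(find_hits(genome, "TGATTACAGGT", index))
```

Because the index is built once per genome, each new query costs only seed lookups plus short extensions, which is where the near-linear behaviour mentioned above comes from.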

  43. Lots of approximations but a very successful outcome • No multiple solutions • BLAST may not find best matches • The notion of p-values becomes very important (probability of matches in random sequences) • Tuning of the BLAST algorithm parameters • Mixture of hacking and theory • Advantage: satisfies incrementability
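To see why the probability of matches in random sequences matters, it helps to estimate how often a seed would occur purely by chance. The sketch below assumes a uniform random sequence with independent letters, which real genomes are not, and it computes an expected count rather than a p-value. It is not BLAST's actual statistical model, and the function name is hypothetical.

```python
def expected_chance_hits(genome_length, k, alphabet_size=4):
    """Expected number of exact occurrences of one fixed k-letter seed
    in a uniform random sequence (independent, equally likely letters)."""
    positions = genome_length - k + 1
    return positions / alphabet_size ** k


# A 6-nucleotide seed against a genome of 3 billion letters.
print(round(expected_chance_hits(3_000_000_000, 6)))
```

With hundreds of thousands of chance occurrences expected for a single short seed, a seed hit on its own says very little, so longer extended matches have to be judged against what random sequences would produce.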

  44. Example II (Networks and Sociology)

  45. Money travels (bills)

  46. Probabilities P(time,distance)

  47. Money travels • The entire process could be implemented using sensors. • Mimics spread of disease. • The impact of computing will go deeper into the sciences and spread more into the social sciences (Jon Kleinberg, 2006)

  48. Example III (Time Series) • Illustrates data mining and how much CS can help other sciences. Slides from Dr. Eamonn Keogh, University of California, Riverside, CA

  49. Examples of time series
