
Session 2 Wharton Summer Tech Camp



Presentation Transcript


  1. Session 2, Wharton Summer Tech Camp
     • 1: Basic Python
     • 2: Regex

  2. Announcement
     If you did not get an email from me saying that the slides have been uploaded, please email me and I’ll add you to the list.

  3. Why Python?
     • Has many great packages useful for us (scientific computing, machine learning, NLP, scraping, etc.)
     • One of the easiest and most concise languages, yet powerful
     • Memory consumption was often "better than Java and not much worse than C or C++"
     • Has IDLE ("Integrated DeveLopment Environment")
     • Read-Eval-Print Loop (REPL)
     • It’s very similar to R
     • Great OOP (compared to other comparable languages, say Perl. bless() those who use it)
     • Highly scalable
     • Easy incorporation of other languages (Cython, Jython). Wrappers.
     • Named after Monty Python
     Used by many companies as a prototyping and "duct-tape" language as well as the main language: Wall Street, Con Edison, Yahoo, CERN, NASA, Google, etc. YouTube and Dropbox are also written in Python!

  4. A Bit More Background on Python
     • Does a few things EXCELLENTLY (OOP, scientific computing, etc.) and is generally good for a lot of things
     • Created by Guido van Rossum in the late 1980s
     • Programmer oriented (easy to write and read). Use of white space.
     • Automatic memory management
     • Can be interpreted or compiled (PyPy: just-in-time compiler)
     • Direct opposite of Perl when it comes to programming philosophy
     • Perl: "there is more than one way to do it" -> Super fun when writing your own code. Rage when you debug other people’s Perl code (there is even an Obfuscated Perl Contest)
     • Python: "there should be one-- and preferably only one --obvious way to do it" -> Writing your own & reading others’ = fun
     • Would you like to know more?
     • http://www.youtube.com/watch?v=ugqu10JV7dk
     • Van Rossum talks about the history of Python for 110 min!

  5. Editor for Python
     • Recommendation
     • idlex: a more advanced IDLE
     • http://idlex.sourceforge.net/download.html
     • Spyder and Canopy IDE also have good reviews
     • IDEs are usually heavy
     • Great for big projects and professional developers, but for simple scripting I’d stick with IDLE/idlex
     • If you want to feel like a badass programmer/hacker, you can use the Emacs or Vim editor. Both take time to learn.

  6. Installing Packages for Python
     • The Enthought distribution includes many packages, but you will need to download additional packages later on.
     • easy_install
     • https://pypi.python.org/pypi/setuptools/0.9.8#installing-and-using-setuptools
     • pip
     • http://www.pip-installer.org/en/latest/installing.html#
     • These are equivalent to "install.packages()" in R
     • Mac users
       • Open a terminal and type either command (easy_install or pip) to see whether you have it
       • If you do, you can automatically download and install Python packages using
         • sudo easy_install packagename
         • sudo pip install packagename

  7. Let’s start coding in Python!
     • I’ll quickly go through the basic ideas
     • Don’t try to get everything on the first pass
     • Just try to get the overarching theme
     • The point is to be exposed to this multiple times before it settles in
     • It’s good to get an overview when you first learn it, before you jump into the tutorial
     • Followed by a 10-minute in-class lab with Q&A
     • Go home and do the extensive interactive tutorial
     Fire up your IDLE(X). Load the file called basicpython.py from the camp website.

  8. Basic Data Types
     • All the standard types
     • Integers and floating-point numbers
       • 2, 2.2, 3.14, etc.
     • Strings
       • "Hi, I am a string"
     • Booleans
       • True
       • False
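As a rough illustration of the types listed above, the sketch below inspects a few values with the built-in type(). The variable names are made up for this example and are not taken from basicpython.py. It uses single-argument print(...) calls so that it should run under both the Python 2 used in these slides and Python 3.

```python
# Illustrative values; the variable names are assumptions, not from the camp files.
an_integer = 2
a_float = 3.14
a_string = "Hi, I am a string"
a_boolean = True

# type() reports the type of any value; __name__ gives its short name.
for value in (an_integer, a_float, a_string, a_boolean):
    print(type(value).__name__)
```
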

  9. Hello World & Arithmetic
     Helloworld.py
         >>> print "hello, world!"   # that's it
         # <- used for commenting
     Simple arithmetic (+ - * ** / %)
         >>> 1+1
         >>> 5**2
     Booleans (operators: and, or, not, >, <, <=, ==, !=, etc.)
         >>> True
         >>> False

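  10. Strings
          string = "hello"
          string + string    # concatenation
          string * 3         # repetition
          string[0]          # first character
          string[-1]         # last character
          string[1:4]        # slice: characters at index 1, 2 and 3
          len(string)        # length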

  11. Lists, Tuples, and Dictionaries
      There are many data structures, but these four are the most commonly used. Each has pros and cons.
      • List: a list of values
      • Sets: set(list). You can do set operations, which can be faster than going through a list one element at a time.
      • Tuples: just like a list, but immutable and of fixed size. Also, style-wise, a list usually consists of homogeneous items, while a tuple can consist of heterogeneous items that form some sort of structure: (firstname, lastname), (name, age)
      • Dictionaries: hash lookup table. An index of stuff. Basic bookkeeping, "Key->Value". Fast lookup, O(1) on average.
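The four structures above can be sketched side by side as follows. This is only a small illustration: the player names echo the later slides, while the set contents, the (name, age) record and the ranking values are invented for the example. It is written to run under both Python 2 and Python 3.

```python
# List: ordered and mutable.
players = ["Federer", "Nadal", "Murray", "Djokovic"]
players.append("Ferrer")

# Set: built from a list; supports membership tests and set operations.
semifinalists = set(["Federer", "Nadal", "Murray"])   # made-up data
finalists = set(["Nadal", "Murray"])                  # made-up data
lost_in_semis = semifinalists - finalists             # set difference

# Tuple: fixed size and immutable, handy for small records such as (name, age).
person = ("Roger", 31)     # hypothetical record
name, age = person         # tuple unpacking
try:
    person[1] = 32         # tuples do not support item assignment
except TypeError:
    print("tuples cannot be modified in place")

# Dictionary: key -> value lookup backed by a hash table.
ranking = {"Federer": 5, "Nadal": 4}
print(ranking["Nadal"])
print("Murray" in ranking)
```
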

  12. Lists, Tuples, and Dictionaries
      • List: []
            >>> TPlayersList = ["Federer", "Nadal", "Murray", "Djokovic"]
        range(), append(), pop(), insert(), reverse(), sort(), e.g. TPlayersList.sort()
      • Tuples: ()
            >>> TPlayersTuple = ("Federer", "Nadal", "Murray", "Djokovic")
      • Dictionaries: {}
            >>> TPlayersDict = {"Federer": 5, "Nadal": 4, "Murray": 2, "Djokovic": 1}
            >>> TPlayersDict["Ferrer"] = 3
            >>> TPlayersDict["Ferrer"]
            >>> del TPlayersDict["Ferrer"]
        Let d be a dictionary; then d.keys(), d.values(), d.items()
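The list and dictionary methods named above are easier to remember once you see what each one does to the data. The following is a minimal sketch rather than the contents of basicpython.py; "Berdych" is an extra name added purely for illustration.

```python
players = ["Federer", "Nadal", "Murray", "Djokovic"]

players.append("Ferrer")        # add to the end
players.insert(0, "Berdych")    # insert at index 0 (illustrative extra name)
last = players.pop()            # remove and return the last item
players.sort()                  # sort in place, alphabetically
sorted_copy = list(players)     # keep a copy of the sorted order
players.reverse()               # reverse in place

numbers = list(range(5))        # range() generates 0, 1, 2, 3, 4

rankings = {"Federer": 5, "Nadal": 4, "Murray": 2, "Djokovic": 1}
rankings["Ferrer"] = 3          # add a key
del rankings["Ferrer"]          # remove it again
pairs = sorted(rankings.items())   # (key, value) tuples, sorted by key
print(pairs)
```
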

  13. Lists, Tuples, and Dictionaries
      • When you are first reading in data
        • Think carefully about what you want to do with the data
        • Then decide which data structures to use
      • It is common to have things like
        • List of lists
        • List of tuples, e.g., list of names
        • Dictionary of lists, e.g., ID -> items bought
        • Dictionary of dictionaries, e.g., ID -> View -> Product_id, or ID -> Purchase -> Product_id
        • Dictionary with tuple keys
      • However, once you need things like a dictionary of dictionaries of dictionaries of lists, or similarly ridiculous structures, consider using object-oriented programming
      • Look up Python classes (http://docs.python.org/2/tutorial/classes.html)
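To make the nested structures above concrete, here is a small sketch. All IDs, product codes and the Customer class (including its conversion_rate method) are hypothetical and invented for this example. It shows one possible way to replace deep nesting with a class, not a prescribed design.

```python
# Dictionary of lists: customer ID -> items bought (made-up data).
purchases = {}
purchases.setdefault("u1", []).append("book")
purchases.setdefault("u1", []).append("pen")
purchases.setdefault("u2", []).append("lamp")

# Dictionary of dictionaries: ID -> event type -> product IDs.
events = {"u1": {"View": ["p10", "p11"], "Purchase": ["p10"]}}

# Dictionary with tuple keys: (ID, product ID) -> number of views.
views = {("u1", "p10"): 3, ("u1", "p11"): 1}


# Once the nesting gets deep, a small class can keep things readable.
class Customer(object):
    def __init__(self, customer_id):
        self.customer_id = customer_id
        self.viewed = []
        self.purchased = []

    def conversion_rate(self):
        """Fraction of viewed products that were purchased."""
        if not self.viewed:
            return 0.0
        return float(len(self.purchased)) / len(self.viewed)


customer = Customer("u1")
customer.viewed.extend(events["u1"]["View"])
customer.purchased.extend(events["u1"]["Purchase"])
print(customer.conversion_rate())
```
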

  14. Basic Control Flow
      • Boils down to
        • if (elif, else)
        • while
        • for
      • Python has convenient syntactic sugar for iterating over different data structures

  15. Basic Control Flow
      • True things
        • True
        • Any non-zero number
        • Any non-empty string or data structure
      • False things
        • False
        • 0
        • "" (the empty string)
        • Empty data structures
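The rules above can be checked directly with the built-in bool(). This is a minimal sketch; the list of candidate values is chosen for illustration only.

```python
# A mix of values that the rules above classify as true or false.
candidates = [True, 1, -2.5, "hi", [0], False, 0, "", [], {}, ()]

for value in candidates:
    # bool() applies the same test that "if value:" would apply.
    print("%r -> %s" % (value, bool(value)))

truthy = [v for v in candidates if v]
falsy = [v for v in candidates if not v]
```

Note that [0] counts as true: the list is non-empty, even though its only element is 0.
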

  16. If and while
          if True:
              print "everything is good"
          else:
              print "?! HUHHHHH?"

          i = 1
          while (i <= 5):
              print "Hellodoctornamecontinueyesterdaytomorrow"
              i += 1
              if i > 5:
                  print "good morning dr. chandra"

  17. Basic Control Flow - for
          for player in TPlayersList:
              print player

          for player in sorted(TPlayersList):
              print player

          for index, player in enumerate(TPlayersList):
              print index, player

          for i in xrange(1, 10, 2):
              print i

          for key, value in TPlayersDict.iteritems():
              print key, value

  18. continue and break
      • While running loops, you may need to skip an iteration or stop at some point. Look up
        • continue
        • break
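A short sketch of the two statements, reusing the player names from the earlier slides. The visited list is an illustrative name, not something from the camp files.

```python
players = ["Federer", "Nadal", "Murray", "Djokovic"]
visited = []

for player in players:
    if player == "Nadal":
        continue      # skip the rest of this iteration, move on to the next player
    if player == "Djokovic":
        break         # leave the loop entirely
    visited.append(player)

print(visited)
```

Here "Nadal" is skipped by continue, and the loop stops at "Djokovic" because of break, so only "Federer" and "Murray" are collected.
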

  19. Defining a function
          def fib(n):    # write Fibonacci series up to n
              """Print a Fibonacci series up to n."""
              a, b = 0, 1
              while a < n:
                  print a,
                  a, b = b, a+b
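The function above prints its results. A common variation, sketched here under the hypothetical name fib_list, collects the numbers and returns them so that the caller can reuse the values.

```python
def fib_list(n):
    """Return a list of the Fibonacci numbers smaller than n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a + b
    return result


print(fib_list(100))
```
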

  20. Importing Libraries
      • Import a library
        • E.g. "import sys"
      • Some useful libraries
        • sys
        • re
        • csv
        • scipy
        • numpy
      • http://wiki.python.org/moin/UsefulModules#Useful_Modules.2C_Packages_and_Libraries
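The import statement can be sketched as follows. Only standard-library modules are used so that the example runs without installing anything. The math module and the sample string are additions for illustration and do not appear in the list above.

```python
import sys                 # import a whole module
import re
from math import sqrt      # import a single name from a module

print(sys.version_info[0])     # major version of the running Python

# re.findall returns every non-overlapping match as a list of strings.
numbers = re.findall(r"\d+", "Session 2, slide 20")
print(numbers)

print(sqrt(16))
```
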

  21. File IO
      • Reading data files into memory
      • open(): returns a file object which can read or write files
      • open(filename, mode)
      • filehandle = open(filename, mode)
      • filehandle.readline()
      Mode
      • r = read
      • w = write
      • a = append
      • rb = read in binary (Windows makes that distinction)
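A minimal sketch of the modes listed above. It writes to a temporary directory so that it does not touch any real data. The file name example.txt is made up for this illustration.

```python
import os
import tempfile

# Use a throwaway directory; the file name is an assumption for this example.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

filehandle = open(path, "w")      # "w" = write (overwrites an existing file)
filehandle.write("first line\n")
filehandle.close()

filehandle = open(path, "a")      # "a" = append to the end
filehandle.write("second line\n")
filehandle.close()

filehandle = open(path, "r")      # "r" = read
first = filehandle.readline()     # one line, including the trailing "\n"
rest = filehandle.readlines()     # the remaining lines, as a list
filehandle.close()

# A with-block closes the file automatically, even if an error occurs.
with open(path, "r") as filehandle:
    lines = [line.rstrip("\n") for line in filehandle]
print(lines)
```
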

  22. Python Example 1
      • Reading a CSV and saving each row in a list
      • Dealing with CSV can be very painful.
      • Sometimes a different character encoding causes problems when reading a CSV
      • If CSV reading just doesn’t work, suspect an encoding issue. Look up encodings (ISO-8859-1/latin1 to UTF-8)
      • This is one reason serious programs rarely use CSV as their storage mechanism
      • Fire up csvRead.py
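The file csvRead.py is not reproduced in this transcript, so the following is only a sketch of the idea described above: read each CSV row into a list, and fall back to Latin-1 when UTF-8 decoding fails. The file name, its contents and the helper read_rows are all invented for this example. Unlike the Python 2 code in the slides, this sketch assumes Python 3, where open() accepts an encoding argument.

```python
import csv
import os
import tempfile

# Build a small Latin-1 encoded CSV so the example is self-contained (made-up data).
path = os.path.join(tempfile.mkdtemp(), "players.csv")
with open(path, "w", encoding="latin1", newline="") as f:
    f.write("name,country\nFederer,Switzerland\nAlmagro,Espa\u00f1a\n")


def read_rows(filename, encoding):
    """Return every row of a CSV file as a list of strings."""
    with open(filename, "r", encoding=encoding, newline="") as f:
        return [row for row in csv.reader(f)]


try:
    rows = read_rows(path, "utf-8")      # wrong guess: the file is Latin-1
except UnicodeDecodeError:
    rows = read_rows(path, "latin1")     # fall back to the actual encoding

print(rows)
```

The first attempt fails because the Latin-1 byte for "ñ" is not valid UTF-8 in this position, which is the kind of encoding issue the slide warns about.
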

  23. Lab
      Do the interactive tutorials at home:
      • http://www.codecademy.com/en/tracks/python
      • http://www.learnpython.org/
      For now, do this for 10 minutes.
