Basics of HTML Elements and Tags
Learn about HTML tags and elements, create and structure HTML documents, and understand how browsers interpret markup tags for webpage display.
HTML • Darby Tien-Hao Chang • Department of Electrical Engineering, National Cheng Kung University
HTML introduction • HTML stands for Hyper Text Markup Language • An HTML file is a text file containing small markup tags • The markup tags tell the Web browser how to display the page • An HTML file must have an htm or html file extension • An HTML file can be created using a simple text editor
Sample HTML • <html> • <head> • <title>Title of page</title> • </head> • <body> • This is my first homepage. <b>This text is bold</b> • </body> • </html>
HTML elements • HTML tags are used to mark up HTML elements • HTML tags are surrounded by the two characters < and > • The surrounding characters are called angle brackets • HTML tags normally come in pairs like <b> and </b> • The first tag in a pair is the start tag, the second tag is the end tag • The text between the start and end tags is the element content • HTML tags are not case sensitive: <b> means the same as <B>
Sample HTML • <b>This text is bold</b> • Start tag, content, end tag • <body> • This is my first homepage. <b>This text is bold</b> • </body> • <body bgcolor="red"> • Tag attribute
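The start tag, content, end tag, and attribute labelled above can be made visible with a small parser. This is a minimal Python sketch using the standard library's html.parser module; the class name TagLogger and the helper log_tags are illustrative names, not part of the slides.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record each start tag, end tag, and piece of text the parser sees."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs, e.g. [("bgcolor", "red")].
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag, {}))

    def handle_data(self, data):
        # Skip text that is only whitespace between tags.
        if data.strip():
            self.events.append(("content", data.strip(), {}))


def log_tags(markup):
    """Return the list of (kind, value, attributes) events for a markup string."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


if __name__ == "__main__":
    # Upper-case tags are reported in lower case: tag names are not case sensitive.
    for event in log_tags('<BODY bgcolor="red"><B>This text is bold</B></BODY>'):
        print(event)
```

The parser reports tag names in lower case, which matches the rule that <b> and <B> mean the same thing.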
Basic HTML tags • <html> Defines an HTML document • <body> Defines the document's body • <h1> to <h6> Define heading 1 to heading 6 • <p> Defines a paragraph • <br /> Inserts a single line break • <hr /> Defines a horizontal rule • <!-- ... --> Defines a comment
Sample HTML • <html> • <body> • <h1>This is heading 1</h1> • <h2>This is heading 2</h2> • <h3>This is heading 3</h3> • <h4>This is heading 4</h4> • <h5>This is heading 5</h5> • <h6>This is heading 6</h6> • </body> • </html>
Sample HTML • <html> • <body> • <p> • This paragraph • contains a lot of lines • in the source code, • but the browser • ignores it. • </p> • <p> • To break<br>lines<br>in a<br>paragraph,<br>use the br tag. • </p> • </body> • </html>
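The first paragraph in the sample above is displayed on one line because the browser treats any run of spaces and line breaks in the source as a single space. A rough Python approximation of that behaviour for ordinary paragraph text (it does not model <br> or <pre>; collapse_whitespace is an illustrative name):

```python
def collapse_whitespace(text):
    """Replace every run of spaces, tabs, and newlines with one space."""
    return " ".join(text.split())


source = """
This paragraph
contains a lot of lines
in the source code,
but the browser
ignores it.
"""

if __name__ == "__main__":
    print(collapse_whitespace(source))
```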
Sample HTML • <html> • <body> • <h1 align="center">This is heading 1</h1> • <hr /> • <h2 color="red">This is heading 2</h2> • <!--This comment will not be displayed--> • </body> • </html>
More HTML tags • <b> Defines bold text • <big> Defines big text • <em> Defines emphasized text • <i> Defines italic text • <small> Defines small text • <strong> Defines strong text • <sub> Defines subscripted text • <sup> Defines superscripted text • <ins> Defines inserted text • <del> Defines deleted text • <code> Defines computer code text • <kbd> Defines keyboard text • <samp> Defines sample computer code • <tt> Defines teletype text • <var> Defines a variable • <pre> Defines preformatted text • <abbr> Defines an abbreviation • <acronym> Defines an acronym • <address> Defines an address element • <bdo> Defines the text direction • <blockquote> Defines a long quotation • <q> Defines a short quotation • <cite> Defines a citation • <dfn> Defines a definition term
Haha • s/<[^>]*>//g
Powerful regular expression • s/<[^>]*>//g • s substitute • < left angle bracket • [^>] any character except a right angle bracket • [^>]* zero or more such characters, i.e. the tag name and its attributes • > right angle bracket • g replace globally, i.e. all occurrences
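The same substitution can be written in Python with re.sub. This is a minimal sketch of the pattern above, not a full HTML parser: a > inside an attribute value or a comment would end the match early. The name strip_tags is illustrative.

```python
import re

# Same pattern as the Perl substitution: "<", then any run of non-">" characters, then ">".
TAG = re.compile(r"<[^>]*>")


def strip_tags(markup):
    """Delete every tag; re.sub replaces all matches, like the Perl /g flag."""
    return TAG.sub("", markup)


if __name__ == "__main__":
    print(strip_tags("This is my first homepage. <b>This text is bold</b>"))
    # This is my first homepage. This text is bold
```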
Are semantics important? • Yes, sometimes • To extract the heading of a news article • http://news.yam.com/ettoday/society/200608/20060816189987.html • <h2><span class="red1">發票案/李慧芬週五前返澳 近日將與李碧君對質</span></h2> • /^<h2><span class="red1">(.*)<\/span><\/h2>\n$/ • print $1, "\n";
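A Python sketch of the same extraction, assuming (as the Perl pattern does) that the heading occupies a whole line of the page. The function name find_heading is illustrative; the sample line is the one shown on the slide.

```python
import re

# Mirrors the Perl pattern: the headline is whatever sits inside the red1 span.
HEADING = re.compile(r'^<h2><span class="red1">(.*)</span></h2>$')


def find_heading(page):
    """Return the first matching headline in the page text, or None."""
    for line in page.splitlines():
        match = HEADING.match(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    sample = (
        "<html>\n"
        '<h2><span class="red1">發票案/李慧芬週五前返澳 近日將與李碧君對質</span></h2>\n'
        "</html>\n"
    )
    print(find_heading(sample))
```

Here the tags carry meaning: the h2 and span class are what identify the headline, so they must be matched rather than stripped.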
How to display a less than sign (<) in browser? • Character Entities • A character entity has three parts: an ampersand (&), an entity name or a # and an entity number, and finally a semicolon (;). • To display a less than sign in an HTML document we must write: &lt; or &#60;
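Python's standard html module converts in both directions, which is handy when generating pages or reading scraped text. A short sketch:

```python
import html

if __name__ == "__main__":
    # Escape special characters so the browser shows them instead of parsing them.
    print(html.escape("a < b"))          # a &lt; b

    # Both the named and the numbered entity decode to the same character.
    print(html.unescape("&lt; &#60;"))   # < <
```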
HTML links • <html> • <body> • <p> • <a href="lastpage.htm"> • This text</a> is a link to a page on • this Web site. • </p> • <p> • <a href="http://www.microsoft.com/"> • This text</a> is a link to a page on • the World Wide Web. • </p> • </body> • </html>
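The first link above is relative to the current site and the second is absolute. When collecting links from a fetched page, relative ones have to be resolved against the page's own address. A minimal Python sketch; the base URL http://www.example.com/site/index.htm and the class name LinkCollector are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # urljoin leaves absolute URLs alone and resolves relative ones.
                self.links.append(urljoin(self.base_url, href))


def collect_links(markup, base_url):
    collector = LinkCollector(base_url)
    collector.feed(markup)
    collector.close()
    return collector.links


if __name__ == "__main__":
    page = (
        '<p><a href="lastpage.htm">This text</a> is a link.</p>'
        '<p><a href="http://www.microsoft.com/">This text</a> is a link.</p>'
    )
    for link in collect_links(page, "http://www.example.com/site/index.htm"):
        print(link)
```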
HTML frames • <html> • <frameset cols="25%,50%,25%"> • <frame src="frame_a.htm"> • <frame src="frame_b.htm"> • <frame src="frame_c.htm"> • </frameset> • </html> • <html> • <frameset rows="25%,50%,25%"> • <frame src="frame_a.htm"> • <frame src="frame_b.htm"> • <frame src="frame_c.htm"> • </frameset> • </html>
HTML frames • <html> • <frameset rows="50%,50%"> • <frame src="frame_a.htm"> • <frameset cols="25%,75%"> • <frame src="frame_b.htm"> • <frame src="frame_c.htm"> • </frameset> • </frameset> • </html>
HTML tables • <table border="1"> • <tr> • <td>row 1, cell 1</td> • <td>row 1, cell 2</td> • </tr> • <tr> • <td>row 2, cell 1</td> • <td>row 2, cell 2</td> • </tr> • </table>
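Because rows and cells are marked explicitly with <tr> and <td>, a table is easy to turn back into data. A minimal Python sketch that reads a simple table like the one above into a list of rows; it ignores colspan, rowspan, and nested tables, and the names TableReader and read_table are illustrative.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    """Turn <tr>/<td>/<th> markup into a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:
                # Tolerate a cell that appears before any <tr>.
                self.rows.append([])
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def read_table(markup):
    reader = TableReader()
    reader.feed(markup)
    reader.close()
    return reader.rows


if __name__ == "__main__":
    table = (
        '<table border="1">'
        "<tr><td>row 1, cell 1</td><td>row 1, cell 2</td></tr>"
        "<tr><td>row 2, cell 1</td><td>row 2, cell 2</td></tr>"
        "</table>"
    )
    for row in read_table(table):
        print(row)
```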
HTML tables • <html> • <body> • <h4>Cell that spans two columns:</h4> • <table border="1"> • <tr> • <th>Name</th> • <th colspan="2">Telephone</th> • </tr> • <tr> • <td>Bill Gates</td> • <td>555 77 854</td> • <td>555 77 855</td> • </tr> • </table> • <!-- continued --> • <h4>Cell that spans two rows:</h4> • <table border="1"> • <tr> • <th>First Name:</th> • <td>Bill Gates</td> • </tr> • <tr> • <th rowspan="2">Telephone:</th> • <td>555 77 854</td> • </tr> • <tr> • <td>555 77 855</td> • </tr> • </table> • </body> • </html>
HTML lists • <html> • <body> • <h4>An Unordered List:</h4> • <ul> • <li>Coffee</li> • <li>Tea</li> • </ul> • <h4>An Ordered List:</h4> • <ol> • <li>Coffee</li> • <!-- continued --> • <li>Tea</li> • </ol> • <h4>A Definition List:</h4> • <dl> • <dt>Coffee</dt> • <dd>Black hot drink</dd> • <dt>Milk</dt> • <dd>White cold drink</dd> • </dl> • </body> • </html>
HTML forms • <form> • <input> • <input> • </form> • description: <input type="text" name="name" /> • <input type="radio" name="name" value="value" />description • <input type="checkbox" name="name" />description • <select name="name"> • <option value="value 1">description 1 • <option value="value 2">description 2 • </select> • <textarea rows="10" cols="30"> • default text • </textarea>
Form’s action attribute and submit button • <form name="input" action="html_form_action.asp" method="get"> • Username: <input type="text" name="user" /> • <input type="submit" value="Submit" /> • </form>
Methods GET and POST in HTML forms - what's the difference? • http://www.cs.tut.fi/~jkorpela/forms/methods.html • The difference between GET and POST is primarily defined in terms of form data encoding: the former means that form data is encoded (by a browser) into a URL, while the latter means that the form data appears within a message body • If the processing of a form is idempotent (i.e. it has no lasting observable effect on the state of the world), the form method should be GET • If the service associated with the processing of a form has side effects (for example, modification of a database or subscription to a service), the method should be POST
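The encoding difference can be seen without sending anything. This Python sketch builds what a browser would send for the form with the user field shown earlier: for GET the data goes into the URL, for POST the same encoded string goes into the message body. The value alice and the helper names encode_get and encode_post are made up for illustration.

```python
from urllib.parse import urlencode


def encode_get(action, fields):
    """GET: the form data is appended to the action URL as a query string."""
    return action + "?" + urlencode(fields)


def encode_post(fields):
    """POST: the same encoded data travels in the request body instead."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    body = urlencode(fields).encode("ascii")
    return headers, body


if __name__ == "__main__":
    fields = {"user": "alice"}
    print(encode_get("html_form_action.asp", fields))
    # html_form_action.asp?user=alice
    print(encode_post(fields))
```

Because GET data is part of the URL, a script can fetch a form result simply by requesting that URL.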
Exercise • Retrieve the resolution, number of units, EC number, and so on for a given PDB ID • http://www.pdb.org/ • Today’s headings • Comics • http://jojo.jojohot.com/ • use LWP::Simple; • $web = get( $url );
Exercise hints • $web =~ /Title\s*<.td>\s*[^>]*>\s*([^\n]+)/
JavaScript – a case study • http://proteminer.csie.ntu.edu.tw/
A review of dirtycomi • http://dm.www.wangyou.com/ • Encoding (Big5, GB2312, UTF-8) • Retrieve HTML code with the GET method • Traverse multiple pages • Trace the JavaScript code and re-implement it in Perl • Make the script behave exactly like a human using a browser
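Two of the points above, encoding and looking like a browser, can be sketched in Python with the standard library. The User-Agent string and the function names are hypothetical, and trying encodings in a fixed order is only a heuristic: bytes in one Chinese encoding can sometimes decode without error in another, so a charset declared by the page or server should be preferred when available.

```python
from urllib.request import Request, urlopen

# Hypothetical browser-like header; many sites treat the default client differently.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; ExampleFetcher/0.1)"}


def build_request(url):
    """Create a GET request that carries a browser-like User-Agent header."""
    return Request(url, headers=BROWSER_HEADERS)


def decode_page(raw, encodings=("utf-8", "big5", "gb2312")):
    """Try each encoding in turn and return the first clean decode."""
    for name in encodings:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    # Nothing decoded cleanly: fall back and mark the undecodable bytes.
    return raw.decode("utf-8", errors="replace")


def fetch(url):
    """Retrieve a page with GET and return its text."""
    with urlopen(build_request(url), timeout=10) as response:
        return decode_page(response.read())
```

Traversing multiple pages then amounts to calling fetch on each page's address in turn.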