Bracmat


  1. Bracmat: A guided tour. Bart Jongejan, 2013.

  2. • The name • Applications • Core Methods • Why Bracmat? • Code examples • Documentation • Development • Download • Finale

  3. The name: Bracmat (brachiat.: with branches). 1741: Country on planet Nazar, inhabited by juniper trees, with good facilities for astronomy, transcendental philosophy and mining (Niels Klim, by Ludvig Holberg, 1684-1754). 2013: Software for analysis and transformation of uncharted and complex data.

  4. Examples of Applications • HTML cleaning • validation of text corpora • extraction of tabular data from text • semantic analysis of text • automatic workflow creation • computer algebra • investigation of email chains

  5. HTML cleaning • ensure standard header and footer • check links • add closing tags • warn if element not allowed in context • remove or translate disallowed attributes • translate deprecated elements (font, center) • remove redundant elements (small big)

  6. Validation of text corpora • Dutch corpora (the Netherlands/Flanders): CGN (2006), MWE (2007), D-COI (2008), DPC (2010), Lassy (2011), SoNaR (2012) • XML well-formedness, tag usage, sampling, visualisation for manual tasks, statistics, tabular parts of reports.

  7. Data extraction from text

  8. Semantic analysis. "Skal jeg tisse mere af diabetes?" ("Do I have to urinate more because of diabetes?") First: tokenizer, tagger (opennlp), parser (mate-tools). Then: using patterns, find relation and concepts in parse tree. Result: Polyuri DUE TO diabetes mellitus

  9. Automatic Workflow Creation

  10. Computer algebra • Face tracker: frame-by-frame video analysis • Head gestures: velocity, acceleration. [Plot labels: x: frame #; y: pixel position; a: head position; b: head velocity; c: head acceleration (∝ muscle force)]

  11. Computer algebra • Solve three equations → acceleration
      Bracmat solution:
        ( c
        .   ( -1*St2^3 + 2*St*St2*St3 + St2*St4*period + -1*St^2*St4 + -1*St3^2*period ) ^ -1
          * ( -1*Sh*St2^2 + Sh*St*St3 + St*St2*Sth + St2*St2h*period + -1*St3*Sth*period + -1*St^2*St2h )
        )
      Java code:
        return ( Sh*(St*St3 - St2*St2) + Sth*(St*St2 - St3*period) + St2h*(St2*period - St*St) )
             / ( St2*(2*St*St3 - St2*St2 + St4*period) - St*St*St4 - St3*St3*period );
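The Bracmat result and the hand-factored Java expression on this slide should denote the same rational function. A minimal Python sketch of a spot check follows; it is not a proof and not part of the original deck. The function names and the sample values are hypothetical, and `fractions.Fraction` is used so that the comparison is exact.

```python
from fractions import Fraction

def c_bracmat(Sh, St, St2, St3, St4, Sth, St2h, period):
    """Transcription of the Bracmat form: denominator^-1 * numerator."""
    denominator = (-1*St2**3 + 2*St*St2*St3 + St2*St4*period
                   + -1*St**2*St4 + -1*St3**2*period)
    numerator = (-1*Sh*St2**2 + Sh*St*St3 + St*St2*Sth
                 + St2*St2h*period + -1*St3*Sth*period + -1*St**2*St2h)
    return Fraction(numerator, denominator)

def c_java(Sh, St, St2, St3, St4, Sth, St2h, period):
    """Transcription of the factored Java expression."""
    numerator = (Sh*(St*St3 - St2*St2) + Sth*(St*St2 - St3*period)
                 + St2h*(St2*period - St*St))
    denominator = (St2*(2*St*St3 - St2*St2 + St4*period)
                   - St*St*St4 - St3*St3*period)
    return Fraction(numerator, denominator)

# Arbitrary integer sample values (an assumption of this sketch).
sample = dict(Sh=11, St=1, St2=2, St3=3, St4=5, Sth=13, St2h=17, period=7)
print(c_bracmat(**sample) == c_java(**sample))
```

Expanding the Java numerator and denominator term by term gives exactly the six and five terms of the Bracmat form, so the two agree for any input with a non-zero denominator.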

  12. Email chains
      Raw message header:
        Received: from [192.38.108.156] (unknown [192.38.108.156]) by mailgate.sc.ku.dk (Postfix) with ESMTP; Thu, 11 Oct 2012 16:28:59 +0200 (CEST)
        From: bartj@hum.ku.dk
        To: hans Keller <hkeller@intrakt.nl>
        Date: Thu, 11 Oct 2012 16:28:59 +0200
        MIME-Version: 1.0
        Subject: Bracmat, GitHub
        CC: Bart.jongejan@gmail.com
        Message-ID: <5076D7AB.20793.DB4C11@bartj.hum.ku.dk>
        X-Confirm-Reading-To: bartj@hum.ku.dk
        X-pmrqc: 1
        Priority: normal
        X-mailer: Pegasus Mail for Windows (4.63)
        Content-type: Multipart/Alternative; boundary="Alt-Boundary-3336.14371857"
        --Alt-Boundary-3336.14371857
      Extracted chain structure:
        ( "Bracmat, GitHub"
        , "Bart Jongejan"
        , 2012 10 11 14 28 59 200
        , (.)
        )
        ( Bracmat
        , "Bart Jongejan"
        , 2012 10 11 18 8 17 200
        , (
          . "Re: Bracmat"
          , "Hans Keller"
          , 2012 10 12 6 45 15 200
          , (
            . "Re: Bracmat"

  13. Core methods • composition • normalization • pattern matching • procedural logic

  14. Composition: Compose complex expressions from simpler ones. [Diagram: expression + binary operator + another expression → complex expression]

  15. Normalization: Automatically derive canonical expressions from unnormalized ones. [Diagram: arbitrary expression → canonical expression]

  16. Pattern matching: Deconstruct complex expressions into simpler ones using pattern matching. [Diagram: complex expression matched against a pattern with ? placeholders → simple expressions]

  17. Procedural logic complex expression ( pattern ? ? &  ) | 

  18. WHY? How does a test particle move, given a set of basis vectors and a specific metric? → symbolic algebra

  19. Symbolic manipulations easy, but MANY. Pen and paper: doubts about correctness. Computer: no errors. 1986: First version of Bracmat composes and normalises algebraic expressions. 1988: Pattern matching and procedural logic

  20. All Bracmat expressions are binary trees. [Figure: an algebraic expression built from x, a, y, 2 and 3 with the operators +, * and ^, shown both as text and as a binary tree]

  21. Code examples

  22. Keyboard input follows the prompt {?}; the answer follows {!}.
        {?} 1+2
        {!} 3
        {?} a+a+a     ("a" is a symbol, not a variable)
        {!} 3*a       (concise)
        {?} b+a       (non-standard order)
        {!} a+b       (canonical order)
      3, 3*a and a+b are canonical forms of 1+2, a+a+a and b+a, respectively.
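To make the idea of a canonical form concrete, here is a minimal Python sketch (not Bracmat, and far simpler than its real normalizer). It handles only flat sums of integers and symbol names. The function name `normalize_sum` is hypothetical, and putting the numeric part first is a choice of this sketch.

```python
from collections import Counter

def normalize_sum(terms):
    """Return a canonical string for a flat sum of integers and symbol names."""
    number = sum(t for t in terms if isinstance(t, int))
    counts = Counter(t for t in terms if isinstance(t, str))
    parts = [str(number)] if number != 0 else []
    for symbol in sorted(counts):          # fixed (alphabetical) term order
        n = counts[symbol]
        parts.append(symbol if n == 1 else f"{n}*{symbol}")
    return "+".join(parts) if parts else "0"

print(normalize_sum([1, 2]))           # 3
print(normalize_sum(["a", "a", "a"]))  # 3*a
print(normalize_sum(["b", "a"]))       # a+b
```

Any two inputs that denote the same sum produce the same string, which is what makes the output a canonical form.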

  23. Operators (initially): * multiplication; + addition; ^ exponentiation; \L taking a logarithm; \D taking a derivative. NO operators for subtraction and division: a - b = a+(-1*b) and a / b = a*(b^-1)
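The two rewrite rules for subtraction and division can be sketched in Python on expression trees written as nested tuples `(operator, left, right)`. This representation and the name `desugar` are assumptions made for illustration only. They are not how Bracmat stores expressions internally.

```python
def desugar(expr):
    """Rewrite ('-', a, b) and ('/', a, b) nodes using only +, * and ^."""
    if not isinstance(expr, tuple):
        return expr                      # a number or a symbol name
    op, left, right = expr
    left, right = desugar(left), desugar(right)
    if op == "-":
        return ("+", left, ("*", -1, right))   # a - b = a+(-1*b)
    if op == "/":
        return ("*", left, ("^", right, -1))   # a / b = a*(b^-1)
    return (op, left, right)

print(desugar(("-", "a", "b")))  # ('+', 'a', ('*', -1, 'b'))
print(desugar(("/", "a", "b")))  # ('*', 'a', ('^', 'b', -1))
```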

  24. Bracmat expressions autonomously seek toward stable states. Comparison: garbage falling on dump. Small things slide down through the voids. Chemicals interact. Fumes disappear. Finally all is quiet. This is the “Normal state”.

  25. Landfill expression:
        landfill=ashtray+5*bag+barbie+12*bottle+9*cork+stone+television
      Truck's contents:
        truck=apple+3*bag+paper+phone
      Emptying the truck in the landfill:
        (!landfill + !truck) : ?landfill
      Landfill's new stable state after a while:
        apple+ashtray+8*bag+barbie+12*bottle+9*cork+paper+phone+stone+television
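The landfill sum behaves like a multiset: equal items are merged and the terms settle into a fixed order. A rough Python analogue, assuming `collections.Counter` as the multiset and a hypothetical helper `render` for the output format, reproduces the stable state shown on the slide.

```python
from collections import Counter

def render(bag):
    """Write a Counter as a sum with alphabetically ordered terms."""
    return "+".join(
        name if n == 1 else f"{n}*{name}" for name, n in sorted(bag.items())
    )

landfill = Counter(ashtray=1, bag=5, barbie=1, bottle=12, cork=9,
                   stone=1, television=1)
truck = Counter(apple=1, bag=3, paper=1, phone=1)

landfill = landfill + truck   # analogue of (!landfill + !truck) : ?landfill
print(render(landfill))
# apple+ashtray+8*bag+barbie+12*bottle+9*cork+paper+phone+stone+television
```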

  26. Landfill: not nice, but unwieldy & repulsive. Good news: there are gems in the landfill. If Hengki wants to obtain gems, he needs to: • recognise valuable items and • pick up those valuable items. (Jonathan McIntosh, 2004)

  27. Hengki's program:
          !landfill: ?junk + ?n*((ken|barbie):?gem) + ?morejunk     (scan the landfill)
        & !HengkiStuff+!gem:?HengkiStuff                            (after doll seen, go on with next step: add doll to H.'s possessions)
        & !junk+!morejunk+(!n+-1)*!gem : ?landfill                  (and don't return it to the landfill)
        |                                                           (if no doll seen, landfill and Hengki's possessions remain unchanged)
      Pattern annotations: ?junk and ?morejunk: most of it; (ken|barbie): doll pattern; :?gem: if you see doll, take it.
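The logic of Hengki's program (find a doll, add it to his possessions, put back everything else, and change nothing if no doll is found) can be sketched in Python. This is only an analogue under stated assumptions: the landfill is a `collections.Counter`, and `DOLLS` and `pick_doll` are hypothetical names. An explicit loop also replaces Bracmat's pattern matching.

```python
from collections import Counter

DOLLS = ("ken", "barbie")   # stands in for the doll pattern (ken|barbie)

def pick_doll(landfill, hengki_stuff):
    """Move one doll from landfill to hengki_stuff; return True on success."""
    for gem in DOLLS:
        if landfill[gem] > 0:
            hengki_stuff[gem] += 1       # add doll to Hengki's possessions
            landfill[gem] -= 1           # and don't return it to the landfill
            if landfill[gem] == 0:
                del landfill[gem]
            return True
    return False   # no doll seen: both collections remain unchanged

landfill = Counter(ashtray=1, bag=8, barbie=1, stone=1)
hengki_stuff = Counter()
print(pick_doll(landfill, hengki_stuff), dict(hengki_stuff))  # True {'barbie': 1}
print(pick_doll(landfill, hengki_stuff), "barbie" in landfill)  # False False
```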

  28. Four new binary operators:
        =   bind rhs to symbol on lhs
        :   match lhs (subject) with rhs (pattern)
        &   do rhs if lhs succeeds
        |   do rhs if lhs fails
      and two prefixes:
        ?   capture a value and bind it to the adjacent symbol
        !   produce the value that is bound to the adjacent symbol

  29. = : & | and \D are evaluated away (normally): dynamic forces that shake and break rubble. , and . always persist through evaluation: residual forces that keep things in place. Whitespace, + * ^ and \L can persist, e.g.: y x → y x; y+x → x+y; y*x → x*y. But: "" a, 0+a, 1*a → a, a, a

  30. Examples of data structures that don't change when (re)evaluated:
        x^2,y^2,100                         (3 algebraic expressions separated by commas)
        (.1 0 0) (.0 0 -1) (.0 1 0)         (9 numbers in a matrix)
      BUT: (1 0 0) (0 0 -1) (0 1 0) → 1 0 0 0 0 -1 0 1 0
      Lists built with whitespace, + and * are always flattened!
      Because blank, comma and dot are binary operators, this sentence is a perfect Bracmat expression.

  31. {?} Because blank, comma and dot are binary operators, the sentence you are reading is a perfect Bracmat expression.
      {!} Because blank
          , comma and dot are binary operators
          , the
            sentence
            you
            are
            reading
            is
            a
            perfect
            Bracmat
            expression
          .

  32. Logical expansion of the application domain of Bracmat, as "Software for analysis and transformation of uncharted and complex data", to textual data. Example: check sentence syntax with Bracmat patterns:

  33.   (S=!NP !VP)
      & (NP=!DET !N)
      & (VP=!V|!V !NP)                         (non-terminals)
      & (DET=a|the)
      & (N=woman|man)
      & (V=shoots|kisses)                      (terminals)
      & (   a man kisses the woman:!S          (rule application)
          & put$"That's grammatical!\n"        (screen output if success)
        | put$"not grammatical\n"              (screen output if failure)
        )
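For readers who do not know Bracmat, the same toy grammar can be checked with a small hand-written recognizer in Python. This is a sketch of the grammar only, not of how Bracmat's pattern matcher works. The function names are hypothetical, and generators stand in for backtracking over the alternatives of VP.

```python
DET = {"a", "the"}
N = {"woman", "man"}
V = {"shoots", "kisses"}

def np_ends(words, i):
    """Yield every position where an NP (DET N) starting at i can end."""
    if i + 1 < len(words) and words[i] in DET and words[i + 1] in N:
        yield i + 2

def vp_ends(words, i):
    """Yield every position where a VP (V or V NP) starting at i can end."""
    if i < len(words) and words[i] in V:
        yield i + 1                        # VP = V
        yield from np_ends(words, i + 1)   # VP = V NP

def is_sentence(text):
    """True if the whole text is an S, i.e. an NP followed by a VP."""
    words = text.split()
    return any(end == len(words)
               for mid in np_ends(words, 0)
               for end in vp_ends(words, mid))

print(is_sentence("a man kisses the woman"))  # True
print(is_sentence("a man a man shoots"))      # False
```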

  34. Operator $ applies function to argument. Only few built-in functions, e.g.:
        get   get input from file, keyboard or string
        put   write a result to file, screen or string
        lst   serialize a variable to file, screen or string
        str   concatenate a tree into a single string
      Function application: str$(I m p l o d e) → Implode

  35. Define your own functions. E.g. syntax checker:
        check=
          S NP VP DET N V
        .   (S=!NP !VP)
          & (NP=!DET !N)
          & (VP=!V|!V !NP)
          & (DET=a|the)
          & (N=woman|man)
          & (V=shoots|kisses)
          & !arg:!S
      Notes: = only evaluates lhs. Before dot: declaration of local variables. After dot: function body. 'check' succeeds if match ok.

  36. Call check with a sentence as argument:
        {?} check$(a woman shoots)&okay|no
        {!} okay
        {?} check$(a man a man shoots)&T|F
        {!} F

  37. PARSE TREE
        ( ROOT
        . (VERB.Skal.skal)
          (subj.PRON.jeg.jeg)
          ( vobj
          . (VERB.tisse.tisse)
            ( dobj
            . (ADJ.mere.mere)
              ( pobj
              . (ADP.af.af)
                (nobj.NOUN.diabetes.diabetes)
              )
            )
          )
          (pnct.X."?"."?")
        )
      Annotations: relation ('attribute'), concept 1, concept 2

  38. Relation pattern:
        […]
        | (its.hasTree)
          $ ( !arg
            . ( =
                  (VERB.?.skal|skulle)
                  ?
                  ( vobj
                  . ((VERB.?.?):?a)
                    ( dobj
                    . ?b (pobj.(ADP.af.af) ?LC2)
                    )
                  )
              )
            )
          & !a (dobj.!b):?LC1
          )
        & "DUE TO"
      Annotations: Whatever matches this … must also match this. ?LC2: location of concept 2. !a (dobj.!b): combine fragments. ?LC1: location of concept 1 (fragmented). Why "!a"? "DUE TO": relation ('attribute').

  39. Concept 1. Pattern: (its.hasTree) $ (!LC1 . ( = (VERB.?.tisse) (dobj.(ADJ.?.mere) ?) ) ) → concept 1: (28442001.Polyuri)

  40. Concept 2. Pattern: (its.hasLemma) $ (!LC2.(=sukkersyge ?|diabetes ?)) → concept 2: (73211009."diabetes mellitus ")

  41. (attribute."DUE TO") ( concept1 . "Clinical Finding" . 28442001.Polyuri ) ( concept2 . "Clinical Finding" . 73211009."diabetes mellitus " )

  42. String pattern matching:
          "Is sinning sincere?":?Mytext
        & 0:?Bi                            (initialise bigram accumulator)
        & @( !Mytext                       (subject)
           : ?                             (pattern: any number of bytes, even none)
             (   %?One %?Two ?             (% : at least one byte)
               & (!One !Two)+!Bi:?Bi       (embedded instructions: accumulate)
               & ~                         (fail! (backtrack))
             )
           )
        | lst$Bi

  43. (Bi=
          (" " i)
        + 2*(" " s)
        + (I s)
        + (c e)
        + (e "?")
        + (e r)
        + 3*(i n)
        + 2*(n " ")
        + (n c)
        + (r e)
        + (s " ")
        + 2*(s i));
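Counting adjacent character pairs is easy to reproduce outside Bracmat. The Python sketch below is an analogue, not a translation of the backtracking pattern. The helper name `bigrams` is hypothetical, and the sample string "Is sin in sincere?" is an assumption, chosen because it yields exactly the twelve entries and counts listed above.

```python
from collections import Counter

def bigrams(text):
    """Count every pair of adjacent characters in text."""
    return Counter(zip(text, text[1:]))

bi = bigrams("Is sin in sincere?")   # assumed sample text
print(bi[("i", "n")], bi[(" ", "s")], bi[("n", " ")])  # 3 2 2
```

Where the Bracmat version accumulates pairs into a sum and lets normalization merge equal terms, the Counter merges them directly.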

  44.   0:?Bi
      & "из фрагментов текстов":?Mytext
      & @( !Mytext
         : ?
           (   (%?One&utf$!One)            (require UTF-8 character)
               (%?Two&utf$!Two)
               ?
             & (!One !Two)+!Bi:?Bi
             & ~
           )
         )
      | lst$Bi

  45. Bi=
          (" " т)
        + (" " ф)
        + (а г)
        + (в " ")
        + (г м)
        + (е к)
        + (е н)
        + (з " ")
        + (и з)
        + (к с)
        + (м е)
        + (н т)
        + 2*(о в)
        + (р а)
        + (с т)
        + (т е)
        + 2*(т о)
        + (ф р);
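Why the extra UTF-8 check matters can be illustrated in Python, where the difference between characters and bytes is explicit. This is a sketch under the assumption that the text is stored as UTF-8, and the variable names are illustrative. Pairing raw bytes would split each Cyrillic letter in two, whereas pairing characters gives the counts listed above.

```python
from collections import Counter

text = "из фрагментов текстов"

# Pairs of characters: a Python str is a sequence of code points.
char_bigrams = Counter(zip(text, text[1:]))

# Pairs of raw bytes: each Cyrillic letter takes two bytes in UTF-8.
data = text.encode("utf-8")
byte_bigrams = Counter(zip(data, data[1:]))

print(len(text), len(data))        # 21 40
print(char_bigrams[("о", "в")])    # 2
print(len(char_bigrams))           # 18
```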

  46. Parse 0^n 1^n. Example of recursive pattern.
        {?} AB= ( "" | 0 !AB 1 )
        {?} 0 0 1 1:!AB & good | bad
        {!} good
      The left hand side of | is "nothing", so this matches zero 0's and 1's. The right hand side recurses.
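The recursive pattern reads naturally as a recursive function. A minimal Python sketch follows, assuming the input is a list of integer tokens and using the hypothetical name `matches_ab`. Its two branches correspond to the two alternatives of AB.

```python
def matches_ab(tokens):
    """True if tokens is n zeros followed by n ones, for some n >= 0."""
    if not tokens:                           # the "" alternative
        return True
    return (len(tokens) >= 2
            and tokens[0] == 0 and tokens[-1] == 1
            and matches_ab(tokens[1:-1]))    # the 0 !AB 1 alternative

print(matches_ab([0, 0, 1, 1]))  # True
print(matches_ab([0, 1, 1]))     # False
```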

  47. Parse 0^n 1^n 2^n
        {?} AB= ( "":?C | 0 !AB 1 & 2 !C:?C )
        {?} ABC=!AB !C
        {?} 0 0 1 1 2 2:!ABC & good | bad
        {!} good
      If zero 0's and 1's, then also zero 2's. For each nested pair of 0 and 1, add a 2 to C. After parsing n 0's and n 1's, C contains n 2's.
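The trick of carrying a side result (C) out of the recursive match can be sketched in Python by letting the recursive function return how many 2's are owed. This is an analogue under the assumption of integer tokens in a list. The names `count_ab` and `matches_abc` are hypothetical, and an explicit loop over split points stands in for Bracmat's backtracking.

```python
def count_ab(tokens):
    """Return n if tokens is n zeros then n ones, otherwise None."""
    if not tokens:
        return 0                      # zero 0's and 1's, so zero 2's
    if len(tokens) >= 2 and tokens[0] == 0 and tokens[-1] == 1:
        inner = count_ab(tokens[1:-1])
        # each nested pair of 0 and 1 adds one 2 to what is owed
        return None if inner is None else inner + 1
    return None

def matches_abc(tokens):
    """True if tokens is n zeros, n ones, n twos, for some n >= 0."""
    for split in range(len(tokens) + 1):
        n = count_ab(tokens[:split])
        if n is not None and tokens[split:] == [2] * n:
            return True
    return False

print(matches_abc([0, 0, 1, 1, 2, 2]))  # True
print(matches_abc([0, 0, 1, 1, 2]))     # False
```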

  48. Documentation • http://jongejan.dk/bart/bracmat.html : most complete documentation. • http://rosettacode.org/wiki/Category:Bracmat : over 170 examples that can be compared with implementations in other programming languages.
