1. AAPL Test pTrees (e.g., Tweets)

Sentiment analysis (by doc):
PSB: Positive Sentiment BitMap, 1 iff the doc has positive AAPL sentiment.
PSA: Positive Sentiment Array, measuring the level of positive AAPL sentiment? A PSA for each term? Might term context change the sentiment? Position info gives PSV in context!
Are PSB and PSV derived by hand (humans assign PSB, PSV)?
SA is typically done on small datasets (yesterday's AAPL tweets) with NLP.

We need a Killer Idea?
• A new way to do SA (on data Too Big for the Horizontal Guys (TBHG))?
• A new SA signal which is faster / more accurate / uses vertical data?
• Position info for phrase analysis, but then pTrees are not necessary (identify "key" phrases as having the same starting point as their first term).

Best approach? Compose a historical TBHG (deep and wide) TrainingSet, TS(stock, day, ...), containing a column for every pertinent Wall Street and Main Street input (e.g., Russia invades Kiev!) on each stock on each day. Class label: increased/decreased (by a threshold % the next day?). Build hulls around the "increased" and "decreased" classes. Classify each stock each day. Buy increasers on margin. Sell decreasers short. An SA column would be a pertinent Wall Street measurement column.

[Figure: a Docs x Terms x Position bit cube. Docs: Tweet1, Tweet2, Tweet3. Terms (Vocab): a, again, all, always, an, and, apple, April, are, buy, AAPL, ... Position: 1, 2, ..., 7. A bit is 1 iff the term occurs at that position in that doc.]

The three pTree families cut from the cube:
1. DpTrees (indexed by (T,P)). For Term=AAPL: OR gives the existential AAPL pTree; Sum gives the AAPL term frequency (tf).
2. TpTrees (indexed by (D,P)).
3. PpTrees (indexed by (T,D)). For Term=a: OR of a row gives the Term=a existential-term pTree; Sum gives the Term=a DocFreq (df) array.

Phrases: the figure marks the 1st and 2nd word of a 2-word phrase and the phrase start position (1, 2, ..., 6). A 4D cube for 2-word phrases? (Overkill?)
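The three pTree families on this slide, and the OR and Sum operations attached to them, can be sketched as follows. This is only a minimal illustration under stated assumptions: the class name `PTreeCube`, the toy tweets, whitespace tokenization, and the use of plain Python integers as uncompressed (1-level) bitmaps are all hypothetical choices, not the slides' implementation.

```python
from functools import reduce


class PTreeCube:
    """Toy Docs x Terms x Positions bit cube (hypothetical sketch, uncompressed)."""

    def __init__(self, docs):
        self.doc_ids = list(docs)  # bit i of a DpTree refers to doc_ids[i]
        tokens = {d: text.split() for d, text in docs.items()}
        self.vocab = sorted({w for ws in tokens.values() for w in ws})
        self.n_pos = max(len(ws) for ws in tokens.values())
        # Set of (doc, term, position) triples that hold a 1-bit; positions are 1-based.
        self.occ = {(d, w, p) for d, ws in tokens.items()
                    for p, w in enumerate(ws, start=1)}

    def dptree(self, term, pos):
        """DpTree indexed by (T,P): one bit per doc."""
        return sum(1 << i for i, d in enumerate(self.doc_ids)
                   if (d, term, pos) in self.occ)

    def tptree(self, doc, pos):
        """TpTree indexed by (D,P): one bit per term."""
        return sum(1 << i for i, t in enumerate(self.vocab)
                   if (doc, t, pos) in self.occ)

    def pptree(self, term, doc):
        """PpTree indexed by (T,D): one bit per position."""
        return sum(1 << (p - 1) for p in range(1, self.n_pos + 1)
                   if (doc, term, p) in self.occ)

    def existential(self, term):
        """OR of the term's DpTrees over all positions: docs containing the term."""
        return reduce(lambda a, b: a | b,
                      (self.dptree(term, p) for p in range(1, self.n_pos + 1)), 0)

    def tf(self, term, doc):
        """Term frequency: number of 1-bits in the (T,D) PpTree."""
        return bin(self.pptree(term, doc)).count("1")

    def df(self, term):
        """Document frequency: number of 1-bits in the existential pTree."""
        return bin(self.existential(term)).count("1")


# Hypothetical three-tweet corpus, for illustration only.
cube = PTreeCube({
    "Tweet1": "buy AAPL buy",
    "Tweet2": "sell AAPL",
    "Tweet3": "apple pie",
})
print(cube.tf("buy", "Tweet1"), cube.df("AAPL"), bin(cube.existential("AAPL")))
```

Bit 0 of every DpTree corresponds to the first doc, so the existential AAPL bitmap here has its two low bits set (Tweet1 and Tweet2).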

2. A2564 (256 AAPL tweets)

(Always set the Stride to a multiple of the register size!)

A2564 pTrees (256 AAPL Tweet Docs, 296 Terms, 29 Positions)

Sample tweets (clipped on the slide; "…" marks text that is cut off):
• big potential win at LG and $AAPL (i Watch potential).
• $VHC reiterated Buy and 65 target @ Gilford. Expect $AAPL trial to proceed and confident that $VHC will win
• Glad to be back from my sabbatical great calls around today $HLF $AAPL etc etc...
• Multiple $AAPL key\ Sr. Employees resgined ysdy check out this flow chart http://t-… no alarm but never good.

Term index (T#), terms 1 to 270:
1 $ZAGG; 2 $VZ; 3 $VHC; 4 $T; 5 $SIMO; 6 $RIMM; 7 $OVTI; 8 $NFLX; 9 $JCP; 10 $IDTI
11 $HLF; 12 $HIVE; 13 $GLUU; 14 $FORD; 15 $CTRL; 16 $BIDU; 17 $BABA; 18 $AMZN; 19 $AAPL; 20 you
21 yesterday; 22 year.; 23 Xiaomi's; 24 wrong!; 25 World; 26 win; 27 week; 28 Wedge; 29 Wedbush; 30 wearable
31 Watch; 32 very; 33 Verizion; 34 usual; 35 upgrade; 36 up; 37 unit; 38 TV; 39 trial; 40 trading
41 traction; 42 Topeka; 43 top; 44 tonite; 45 tomorrow; 46 told; 47 today; 48 times; 49 Time; 50 thinks
51 they; 52 there; 53 their; 54 tgt; 55 teardown; 56 target..why; 57 target; 58 taking; 59 sure; 60 suing
61 stream$AAPL; 62 store; 63 stock; 64 Stellar; 65 Stanley); 66 Sr.; 67 Square; 68 spoke; 69 speaking; 70 Sources
71 soon; 72 site; 73 side; 74 showing; 75 Show; 76 shots; 77 shop; 78 several\; 79 service.; 80 Sell
81 sell; 82 scurrying; 83 Schack; 84 Says; 85 says; 86 saying; 87 Samsung; 88 sales; 89 said; 90 sabbatical
91 RT; 92 rexxam; 93 retract; 94 retail; 95 resgined; 96 reports; 97 reporting; 98 reiterated; 99 reit; 100 REIT
101 Redog.great; 102 Redog; 103 raising; 104 raises; 105 Put; 106 products; 107 proceed; 108 price; 109 pretty; 110 presenting
111 prepping; 112 predictions; 113 PR; 114 potential).; 115 potential; 116 poor; 117 Pipers; 118 phones; 119 Phison; 120 Perhing
121 people; 122 pay; 123 Partners; 124 own; 125 out; 126 ouch!!; 127 OTR; 128 opening; 129 NYU; 130 now
131 note; 132 not; 133 no; 134 nite; 135 nice; 136 next; 137 news.; 138 new; 139 New; 140 Need
141 Munster; 142 Multiple; 143 movie; 144 Morgans; 145 Monday; 146 Mobile; 147 Mkts; 148 Minster; 149 mid; 150 Markets
151 making; 152 LQMT; 153 Lotta; 154 Looks; 155 looking; 156 long?; 157 link; 158 line; 159 likely; 160 like
161 LG; 162 level; 163 less; 164 later; 165 language; 166 Korea; 167 key\; 168 Katy; 169 JPM; 170 JP
171 journalism; 172 jefferies; 173 Jefferies; 174 JCP; 175 iWatch; 176 isnt; 177 is; 178 iphone; 179 iPhone; 180 initating
181 indicating; 182 Huberty; 183 http://t.co/SXzX; 184 http://t.co/Ny6D; 185 http://t.co/jZVU; 186 http://t.co/fuCh; 187 Holly; 188 highlighted; 189 highlight; 190 high
191 harsh; 192 Hallum; 193 guys; 194 great; 195 good; 196 going; 197 Global; 198 Glad; 199 Gilford.; 200 get
201 Gene; 202 gap; 203 gain; 204 future; 205 follow\; 206 follow; 207 focusing; 208 flow; 209 find; 210 Expect
211 event; 212 essentially; 213 engineer; 214 Employees; 215 Electronics; 216 early; 217 down; 218 double; 219 documents; 220 dividend
221 defense; 222 defending; 223 defended; 224 deal; 225 days; 226 day; 227 CY; 228 custy; 229 Craig; 230 controller
231 Consumer; 232 confirming; 233 confident; 234 conference; 235 comp.; 236 comments; 237 come; 238 CNBC; 239 cloud; 240 citing
241 check; 242 cheap; 243 chart; 244 CEO; 245 ceo; 246 case\; 247 Capital; 248 Cap; 249 calls; 250 call
251 buy; 252 Buy; 253 bulls; 254 bull; 255 bubbles!!; 256 Bill; 257 bigger; 258 big; 259 believe; 260 belief
261 Barclays; 262 Bad; 263 back; 264 AT&T; 265 astute; 266 associates; 267 asking; 268 around; 269 Are; 270 Apple

Concatenate all levels? AND, analyze, count (if there are multiple 1-bits in a stride, duplicate the next level down, and indicate where the dups are by a mask?).
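The "AND, analyze, count" idea over multi-level pTrees can be sketched as below. This is one assumed reading, not the slides' definition: each upper-level bit is taken to be 1 iff the stride beneath it contains at least one 1-bit, so pure-zero strides are skipped without touching their leaves. The function names and list-of-bits representation are hypothetical.

```python
def build_levels(bits, stride):
    """Return [leaf, level1, ..., top]; an upper bit is 1 iff its stride below has a 1-bit."""
    levels = [list(bits)]
    while len(levels[-1]) > stride:
        below = levels[-1]
        levels.append([int(any(below[i:i + stride]))
                       for i in range(0, len(below), stride)])
    return levels


def count_ones(levels, stride):
    """Count leaf 1-bits, descending only into strides flagged 1 at the level above."""
    def descend(level, index):
        if level == 0:
            return levels[0][index]
        if not levels[level][index]:
            return 0  # pure-zero stride: nothing below needs to be read
        start = index * stride
        end = min(start + stride, len(levels[level - 1]))
        return sum(descend(level - 1, i) for i in range(start, end))

    top = len(levels) - 1
    return sum(descend(top, i) for i in range(len(levels[top])))


def and_leaves(a_levels, b_levels, stride):
    """AND two equal-length multi-level pTrees; returns the resulting leaf bits."""
    out = [0] * len(a_levels[0])

    def descend(level, index):
        if level == 0:
            out[index] = a_levels[0][index] & b_levels[0][index]
            return
        if not (a_levels[level][index] and b_levels[level][index]):
            return  # at least one operand is pure zero here, so the AND is too
        start = index * stride
        end = min(start + stride, len(a_levels[level - 1]))
        for i in range(start, end):
            descend(level - 1, i)

    top = len(a_levels) - 1
    for i in range(len(a_levels[top])):
        descend(top, i)
    return out


# 64 leaf bits with Stride=4 give 3 levels of 64, 16 and 4 bits.
STRIDE = 4
a_bits = [1 if i in (6, 18) else 0 for i in range(64)]
b_bits = [1 if i in (18, 40) else 0 for i in range(64)]
a_levels = build_levels(a_bits, STRIDE)
b_levels = build_levels(b_bits, STRIDE)
print([len(level) for level in a_levels], count_ones(a_levels, STRIDE))
```

With sparse data such as a single tweet's terms, most top-level bits are 0, so both the count and the AND visit only a handful of strides.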
[Figure: PpTrees (Doc1), 3 levels, Stride=4, drawn at Level=2, Level=1, Level=0 and Level=all (levels concatenated), one row per term occurring in Doc1: T# 19, 31, 258, 115, 161, 114, 26.]

[Figure: TpTrees (Doc1), 3 levels, Stride=10, over positions P# 1 to 29. Top-level bits (one bit per block of 100 term numbers) for the displayed positions: P#2: 0 1 0; P#3: 1 0 0; P#5: 0 1 0; P#7: 1 0 0; P#9: 1 0 0; P#10: 0 1 0.]

TpTrees (Doc1), uncompressed, rows T# 1 to 296 by columns P#. Only positions 2, 3, 5, 7, 9 and 10 are displayed. Every row is all 0 except:

T#    Term          Position
19    $AAPL         7
26    win           3
31    Watch         9
114   potential).   10
115   potential     2
161   LG            5

[Figure: DpTree($ZAGG, P=1) and DpTree($VZ, P=20), each drawn as one row of bits over D#, with a single 1-bit in each; also a "DpTree a, 3 levels, Stride=16" example.]

Actual sizes? Instead of T=296, P=29, D=256:

TpTrees: 1 level (uncompressed): T = 100K ≈ 2^17 bits × 2^48 pTrees. 3 levels: Stride = 2^6 = 64, 256 = 2^8 bits × 2^48 pTrees = 2^56 bits. (So 3 levels reduce each pTree by a factor of 512. A processing step seldom involves more than ~hundreds of the 2^48 pTrees.)

PpTrees: 1 level (depends on Docs): P = 256 = 2^8 bits × 2^57 pTrees (= 2^65 bits). (A processing step seldom involves more than ~hundreds of the 2^57 pTrees.)

DpTrees: 1 level (uncompressed): D = 1T = 2^40 bits × 2^25 pTrees (= 2^65 bits). 5 levels: Stride = 2^8 = 256, 2^8 bits × 2^25 pTrees. So 5-level DpTrees reduce each pTree by a factor of ~2^32 = 4B. Concatenate the 5 levels into one 1280-bit vector per (T,P) DpTree? Note: there is never more than one 1-bit in each (T,P) DpTree. For most (T,P)'s there are no 1-bits (i.e., the (T,P) pTree is pure0). So there's never a need to duplicate or extend to more than 1280 bits. The DpTreeSet is stored in optimal processing form, and losslessly, as 2^25 ≈ 32,000,000 concatenated DpTrees (Levels=5, Stride=256).
That's ~2^35 bits = 2^32 bytes = 4B bytes.

Term index (T#), terms 271 to 296:
271 Apple; 272 Another; 273 analyst; 274 Amazon; 275 alarm; 276 ago; 277 again; 278 admits; 279 added; 280 Activatrions
281 activations; 282 activated; 283 Ackman; 284 Ackman; 285 AAPL’s; 286 910; 287 65; 288 64bit; 289 4.3M; 290 2015
291 126; 292 115; 293 1111; 294 1001; 295 1000; 296 0.54%

6. Stock Market Prediction

Compose a historical TB2H (TooBigToBeHorizontal) TrainingSet,
TB2H(Time, Stock, WallStreetMeasurement_1, ..., WSM_w, MainStreetMeasurement_1, ..., MSM_m, Class),
which contains a column for every pertinent Wall Street and Main Street input (e.g., Russia invades Kiev!) on each stock for each historical Time period back from the "current" time period (the Time Unit might be days, hours, microseconds, ...). So the columns are WSM_k at CurrentTimeUnit, WSM_k at OneTimeUnitAgo, ...

The distance function would typically weight an input lower the more aged it is (e.g., by 1/(1+exp(DaysAgo))) and weight future inputs zero. And distance(($AAPL, Today), ($GOOG, 100DaysAgo)) would be the weighted distance backwards from Today for $AAPL matched with the weighted distance backwards from 100DaysAgo for $GOOG.

Note: for $AAPL, e.g., there is a record for each day, and each succeeding day's record is widened by w+m new columns.

The Class column might be: StockPrice increased/decreased (by a threshold % during that Time period?).

Strategy1: Build hulls around the "increased" and "decreased" classes. Classify each stock each new Time period. Buy increasers on margin. Sell decreasers short. The SA during each preceding Time period would be a pertinent Wall Street measurement column.

Strategy2: Use a NearestNeighborSetVote strategy. Invest according to the decisiveness of the vote.
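The age-weighted distance and the NearestNeighborSetVote strategy can be sketched as follows. The weight 1/(1+exp(DaysAgo)) and the zero weight for future inputs come from the text; everything else is an assumption made for illustration: the function names, applying the weight to each day's squared differences, truncating to the shorter history, the +1/-1 class encoding, and the value of k.

```python
import math


def age_weight(days_ago):
    """Weight 1/(1+exp(days_ago)) from the text; future inputs (days_ago < 0) get 0."""
    if days_ago < 0:
        return 0.0
    if days_ago > 700:  # exp() would overflow; the weight is effectively 0 here
        return 0.0
    return 1.0 / (1.0 + math.exp(days_ago))


def aligned_distance(history_a, history_b):
    """Distance between two records, matched backwards from each record's own last day.

    Each history is a list of per-day measurement vectors, oldest first.
    Assumption: the age weight multiplies that day's squared differences, and
    days present in only one history are ignored.
    """
    total = 0.0
    pairs = zip(reversed(history_a), reversed(history_b))
    for age, (day_a, day_b) in enumerate(pairs):
        weight = age_weight(age)
        total += weight * sum((x - y) ** 2 for x, y in zip(day_a, day_b))
    return math.sqrt(total)


def neighbor_vote(query, training, k=3):
    """Mean label of the k nearest training records (+1 increased, -1 decreased).

    The sign gives the predicted class; the magnitude is one possible measure
    of the decisiveness of the vote.
    """
    nearest = sorted(training, key=lambda rec: aligned_distance(query, rec[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Hypothetical one-measurement histories, for illustration only.
training = [
    ([[1.0]], +1),
    ([[1.2]], +1),
    ([[0.9]], -1),
    ([[5.0]], -1),
]
vote = neighbor_vote([[1.0]], training, k=3)
print(age_weight(0), vote)
```

A vote near +1 or -1 would argue for a larger position than a vote near 0, which is one way to read "invest according to the decisiveness of the vote".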
$AAPL record after n days (in WSM_{k,j}, k is the WSM index and j is the DayOfWSM; n in Day_n is the DayOfRecordCreation):
  Day_n, $AAPL, CLASS, WSM_{1,1} … WSM_{w,1}, MSM_{1,1} … MSM_{m,1}, ..., WSM_{1,n} … WSM_{w,n}, MSM_{1,n} … MSM_{m,n}

$AAPL record after n+1 days:
  Day_{n+1}, $AAPL, CLASS, WSM_{1,1} … WSM_{w,1}, MSM_{1,1} … MSM_{m,1}, ..., WSM_{1,n} … WSM_{w,n}, MSM_{1,n} … MSM_{m,n}, WSM_{1,n+1} … WSM_{w,n+1}, MSM_{1,n+1} … MSM_{m,n+1}

For the distance between these two records, shift the shorter record one day-block to the right, so that its day-j block sits under the day-(j+1) block of the longer record and the most recent days line up, and then take SQRT(sum of squared component differences):
  Day_n,     $AAPL, CLASS,                                               WSM_{1,1} … WSM_{w,1}, MSM_{1,1} … MSM_{m,1}, ..., WSM_{1,n} … WSM_{w,n}, MSM_{1,n} … MSM_{m,n}
  Day_{n+1}, $AAPL, CLASS, WSM_{1,1} … WSM_{w,1}, MSM_{1,1} … MSM_{m,1}, WSM_{1,2} … WSM_{w,2}, MSM_{1,2} … MSM_{m,2}, ..., WSM_{1,n+1} … WSM_{w,n+1}, MSM_{1,n+1} … MSM_{m,n+1}

To all 879 and 790 Seminar students: Many of you will be giving a paper (someone else's paper in the seminar, and your own paper in advanced data mining). Thus we need some mechanism to fit all the papers into the time remaining (and I understand that it will be a few weeks at least before anyone is ready). I propose that you all prepare your presentation as a stand-alone presentation with embedded audio (as per my 765 and 789 lecture notes). That way we can all have the benefit of your presentation regardless of the time constraints. Also, another topic category is to take one of the topic areas of the notes I have been sending (which have been online on my web site since 2011) and prepare a comprehensive set of audio-enhanced notes covering what you find in all those notes, plus anything you discover while doing that.