
Using Blind Search and Formal Concepts for Binary Factor Analysis




  1. Using Blind Search and Formal Concepts for Binary Factor Analysis • Aleš Keprt, ales.keprt@vsb.cz

  2. Synopsis • Binary Factor Analysis (BFA): introduction to BFA, exact solution of BFA, quality checking • Possible optimizations • Blind Search method • Method based on Formal Concepts • Tests and the comparison of methods • Possible future work

  3. Binary Factor Analysis (BFA) • Factor analysis of binary data • Using Boolean arithmetic • Trying to express matrix X as a product of two matrices, F and A • The product is binary (Boolean) matrix multiplication • Initial conditions: we know X, the dimensions of all matrices, and the number of ones per row of A
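The Boolean product on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation. It assumes F holds one row per object and A one row per factor, and the name `bool_matmul` is hypothetical.

```python
def bool_matmul(F, A):
    """Boolean matrix product: entry (i, j) is OR over k of (F[i][k] AND A[k][j])."""
    m = len(A)       # number of factors (rows of A)
    p = len(A[0])    # number of variables (columns of A and of the product)
    return [
        [int(any(f_row[k] and A[k][j] for k in range(m))) for j in range(p)]
        for f_row in F
    ]


# Assumed toy data: 3 objects, 2 factors, 3 variables.
F = [[1, 0],
     [1, 1],
     [0, 1]]
A = [[1, 1, 0],
     [0, 1, 1]]
X = bool_matmul(F, A)
print(X)
```

The only difference from an ordinary matrix product is that the sum over k is replaced by a logical OR, so every entry stays 0 or 1.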

  4. Exact BFA Solution (i.e. reference factorizer) • For checking other algorithms • Searches for the best (optimal) solution • Exact = the opposite of an approximate solution • The key: perform all possible optimizations to avoid checking all bit combinations

  5. Boolean arithmetic • Classical arithmetic: • Boolean arithmetic:
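The formulas of this slide did not survive in the transcript. The standard definitions, which the rest of the talk appears to rely on, are given here as an assumption. The symbols $f_{ik}$, $a_{kj}$ and $x_{ij}$ for the entries of F, A and X are introduced only for illustration.

```latex
\begin{aligned}
\text{Classical arithmetic:}\quad & 1 + 1 = 2, &
  x_{ij} &= \sum_{k} f_{ik}\, a_{kj} \\
\text{Boolean arithmetic:}\quad & 1 + 1 = 1, &
  x_{ij} &= \bigvee_{k} \left( f_{ik} \wedge a_{kj} \right)
\end{aligned}
```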

  6. Quality check • Discrepancy (Czech: „odchylka”) • Our goal is to minimize discrepancy • We distinguish between positive and negative discrepancy
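A minimal sketch of the quality check follows. It assumes discrepancy is the number of cells where X and its reconstruction differ. The sign convention is also an assumption, chosen to be consistent with the later remark that concepts generate no negative discrepancy: a missing 1 counts as positive, a spurious 1 as negative.

```python
def discrepancy(X, X_hat):
    """Return (positive, negative) discrepancy between X and its reconstruction.

    Assumed convention:
      positive = cells where X has 1 but X_hat has 0 (a missed one),
      negative = cells where X has 0 but X_hat has 1 (a spurious one).
    """
    positive = negative = 0
    for x_row, h_row in zip(X, X_hat):
        for x, h in zip(x_row, h_row):
            if x == 1 and h == 0:
                positive += 1
            elif x == 0 and h == 1:
                negative += 1
    return positive, negative


X = [[1, 1, 0],
     [0, 1, 1]]
X_hat = [[1, 0, 0],
         [1, 1, 1]]
pos, neg = discrepancy(X, X_hat)
print(pos, neg, pos + neg)  # total discrepancy is the sum of both parts
```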

  7. Possible optimizations • Empty rows and columns: skip • Duplicate rows and columns: merge • Order factor loadings (rows of A) • Constant number of ones per row of A • Use the knowledge of A to get F • Parallel (distributed) computation using multiple computers

  8. Quality check • When merging duplicate rows/columns:
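The formula for this slide is missing from the transcript. One natural reading, stated here as an assumption, is that each merged row must be weighted by the number of original rows it stands for, so the discrepancy of the reduced matrix equals that of the full one. The helper names below are hypothetical.

```python
from collections import Counter


def merge_duplicate_rows(X):
    """Drop empty rows and merge duplicates; return (unique_rows, weights)."""
    counts = Counter(tuple(row) for row in X)
    # An all-zero row is reproduced exactly by an all-zero row of F, so it is skipped.
    unique = [list(row) for row in counts if any(row)]
    weights = [counts[tuple(row)] for row in unique]
    return unique, weights


def weighted_discrepancy(X_unique, X_hat_unique, weights):
    """Discrepancy of the reduced matrix, each row counted `weight` times."""
    return sum(
        w * sum(x != h for x, h in zip(x_row, h_row))
        for x_row, h_row, w in zip(X_unique, X_hat_unique, weights)
    )


X = [[1, 0, 1],
     [0, 0, 0],
     [1, 0, 1],
     [0, 1, 1]]
unique, weights = merge_duplicate_rows(X)
print(unique, weights)
```

Columns can be treated the same way by applying the helper to the transpose of X.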

  9. Blind Search • Blind Search (Czech: „slepé hledání”) • The strategy: (1) Build up a particular candidate for A; (2) Find the best F for this A; (3) Compute the discrepancy; (4) Remember the best A, F pair so far; (5) Go back to step 1
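The five-step strategy can be sketched as an exhaustive loop. This is a toy illustration under stated assumptions, not the author's program: rows are stored as Python integers used as bitmasks, every nonzero p-bit row is allowed as a factor, and the function names are hypothetical.

```python
from itertools import combinations


def reconstruct_row(f_bits, A):
    """OR together the rows of A selected by the set bits of f_bits."""
    row = 0
    for k, a in enumerate(A):
        if (f_bits >> k) & 1:
            row |= a
    return row


def best_f_row(x, A):
    """Try every subset of factors; return (discrepancy, f_bits) with minimal discrepancy."""
    return min(
        (bin(x ^ reconstruct_row(f, A)).count("1"), f)
        for f in range(1 << len(A))
    )


def blind_search(X, p, m):
    """Exhaustively search for m factors over p variables; return (d, A, F)."""
    best = None
    for A in combinations(range(1, 1 << p), m):      # step 1: next candidate A
        rows = [best_f_row(x, A) for x in X]         # step 2: best F for this A
        d = sum(r[0] for r in rows)                  # step 3: total discrepancy
        if best is None or d < best[0]:              # step 4: remember the best pair
            best = (d, A, [r[1] for r in rows])
    return best                                      # step 5 is the loop itself


X = [0b110, 0b111, 0b011]
d, A, F = blind_search(X, p=3, m=2)
print(d, [bin(a) for a in A], [bin(f) for f in F])
```

The number of candidates grows combinatorially with p and m, which is why the optimizations listed earlier matter.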

  10. Blind Search (2.) • The key tasks: (1) Building up the candidates for A; (2) How to get F when knowing A • Building up the candidates for A: row by row (one row = one factor); bit-coded matrices can be much faster; factors cannot repeat on several rows
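A sketch of candidate generation under the constraints named on this slide and among the optimizations: bit-coded rows, a constant number of ones per row, and no repeated factors. Emitting rows in increasing order is one assumed way to realize the "order factor loadings" optimization, since it avoids visiting permutations of the same factor set. The names are hypothetical.

```python
from itertools import combinations


def rows_with_k_ones(p, k):
    """All p-bit rows, coded as integers, with exactly k bits set."""
    return sorted(sum(1 << b for b in bits) for bits in combinations(range(p), k))


def candidate_matrices(p, m, k):
    """Yield each candidate A as a tuple of m distinct bit-coded rows in increasing order."""
    yield from combinations(rows_with_k_ones(p, k), m)


candidates = list(candidate_matrices(p=4, m=2, k=2))
print(len(candidates))                    # C(6, 2) candidate matrices
print([bin(r) for r in candidates[0]])    # first candidate, one integer per factor
```

Storing a row as one integer lets a whole row be combined or compared with a single bitwise operation, which is the speed-up the slide refers to.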

  11. Blind Search (3.) • How to get F when knowing X and A: bit-coded matrices may help (yet again); going row by row (better than the classical "rows x columns" multiplication algorithm) • How to get a particular row of F: (a) blind search; (b) do some preprocessing, then blind search
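The slide does not say which preprocessing is meant. The sketch below shows one possible choice, labeled as an assumption: a factor lying entirely inside the data row can always be switched on, a factor disjoint from it can always be left off, and only the partially overlapping factors need blind search. Rows are bit-coded integers and the function name is hypothetical.

```python
def best_f_row(x, A):
    """Return (discrepancy, f_bits) for data row x and factor rows A (all bit-coded)."""
    forced_on = 0     # factors fully contained in x: switching them on never hurts
    base = 0          # row reconstructed from the forced-on factors
    undecided = []    # factors that overlap x only partially
    for k, a in enumerate(A):
        if a & ~x == 0:
            forced_on |= 1 << k
            base |= a
        elif a & x:
            undecided.append(k)
        # factors disjoint from x would only add spurious ones, so they stay off

    best = (bin(x ^ base).count("1"), forced_on)
    # Blind search restricted to the undecided factors.
    for mask in range(1, 1 << len(undecided)):
        f_bits, row = forced_on, base
        for i, k in enumerate(undecided):
            if (mask >> i) & 1:
                f_bits |= 1 << k
                row |= A[k]
        d = bin(x ^ row).count("1")
        if d < best[0]:
            best = (d, f_bits)
    return best


A = [0b0110, 0b1001, 0b0001, 0b1000]
print(best_f_row(0b1110, A))
```

In this example only one of the four factors is left for the search, so two combinations are examined instead of sixteen.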

  12. Formal context

  13. Formal concept

  14. Using Formal Concepts • Concepts are our candidates for the rows of matrix A • Example: set "p3": data matrix of size 100x100; 10 concepts (see pict.) • Searching for 5 factors: only 8 candidates; 56 possible combinations
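A minimal sketch of how concepts can supply candidate rows for A. It enumerates concept intents naively as all intersections of data rows, which is fine for toy data but is not the lattice tool used by the author. Dropping the empty intent and intents with no supporting object is an assumption about how 10 concepts reduce to 8 candidates. The count 56 on the slide matches choosing 5 factors out of 8 candidates.

```python
from math import comb


def concept_intents(X, p):
    """All concept intents of a bit-coded context X with p attributes (naive closure)."""
    intents = {(1 << p) - 1}              # intent of the empty set of objects
    for row in X:
        intents |= {row & i for i in intents}
    return intents


def candidate_rows(X, p):
    """Intents usable as rows of A: nonempty and shared by at least one object."""
    candidates = []
    for intent in concept_intents(X, p):
        has_object = any(row & intent == intent for row in X)
        if intent and has_object:
            candidates.append(intent)
    return sorted(candidates)


X = [0b110, 0b111, 0b011]
print([bin(c) for c in candidate_rows(X, p=3)])
print(comb(8, 5))  # number of ways to pick 5 factors from 8 candidates
```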

  15. Details • Concepts generate no negative discrepancy: higher computation error, but higher semantic value (for us) • We can get negative discrepancy when computing F (as discussed before) • The tests performed gave promising results

  16. Final Comparison • The presented algorithms are very similar • They give the same results in our tests • Computation times are very different: times for set "p2" (mentioned before): blind search: approx. 3×10^9 years; concepts: 7 seconds

  17. Possible Future Work • The current implementation uses an independent application to compute concept lattices • Integration into a single application may speed up the computation • We don't need to compute the whole concept lattice, or even to know all concepts • We need to find a better algorithm for binary matrix pseudo-division F = X/A
