Thesis Presented By

Thesis Presented By. Mohammad Abul Kalam Azad C011054 Shabbir Ahmad C011051 Francis Palma Tony C013038 Supervised by S. M. Kamruzzaman Assistant Professor Department of Computer Science and Engineering International Islamic University Chittagong.

Presentation Transcript


  1. Thesis Presented By Mohammad Abul Kalam Azad C011054, Shabbir Ahmad C011051, Francis Palma Tony C013038. Supervised by S. M. Kamruzzaman, Assistant Professor, Department of Computer Science and Engineering, International Islamic University Chittagong

  2. An Efficient Technique for Text Compression

  3. Summary of Presentation ☼ Introduction. ☼ Methodology. ☼ How Many Words. ☼ Word Lookup Table. ☼ Word Storing Architecture. ☼ Memory Allocation Method. ☼ Memory Space Requirement. ☼ Data Management Algorithm. ☼ Data Compression. ☼ Experimental Result. ☼ Conclusions.

  4. Introduction ☼ Data Storage. ☼ Information Management Using Symbol. ☼ Secret Data Communication. ☼ Storage Requirement Reduction Problem. ☼ Reduction. ☼ Compression.

  5. Methodology The data compression will be done in two phases: ☼ Reduction Using Lookup Table. ☼ Compression Using Deflate Algorithm.

  6. How Many Words in English Webster’s Third New International Dictionary has 470,000 entries. The Oxford English Dictionary, Second Edition, reports that it includes a similar number. Including: abbreviations, phrases, taboo words, dialects, family words. Excluding: names of entirely scientific terms.

  7. Word Lookup Table A word lookup table is a special tabular data file that stores the text of each word as an attribute of an address. Given an address, the table is used to retrieve the corresponding text for display.

  8. Word Lookup Table A lookup table is defined by its addressing capability, expressed in bits. 2^18 = 262144 is too small for about 470,000 entries, while 2^19 = 524288 is enough. That is, we need a 19-bit lookup table.
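
The address width follows directly from the entry count. A minimal sketch, assuming the roughly 470,000-entry figure from the dictionary slide; the function name `address_bits` is illustrative and does not come from the thesis:

```python
def address_bits(entries: int) -> int:
    """Smallest number of bits that gives every entry a distinct address."""
    return max(1, (entries - 1).bit_length())

# Entry count taken from the "How Many Words in English" slide.
bits = address_bits(470_000)
capacity = 2 ** bits  # number of addresses available at that width
```

With these inputs `bits` is 19 and `capacity` is 524288, which leaves spare addresses beyond the dictionary entries.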

  9. Word Storing Architecture Figure 1. The architecture of the stored word in lookup table

  10. Word Storing Architecture Table 1. Words in word lookup table

  11. Word Storing Architecture Table 2. Special situation handling addresses

  12. Word Storing Architecture Table 3. Entry of different punctuation signs

  13. Example Example 1 Example 2

  14. Example Example 3 Example 4

  15. Memory Space Requirement Space = 2^19 * 75 bits = 524288 * 75 bits = 39321600 bits = 4915200 Bytes = 4800 Kilo Bytes = 4.6875 Mega Bytes
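
The arithmetic on this slide can be checked in a few lines. This is only a restatement of the slide's figures (19-bit addressing, 75 bits per entry); the variable names are illustrative:

```python
ADDRESS_BITS = 19     # width of the lookup-table address, from the slides
BITS_PER_ENTRY = 75   # entry width used in the slide's calculation

entries = 2 ** ADDRESS_BITS
total_bits = entries * BITS_PER_ENTRY
total_bytes = total_bits // 8
total_kilobytes = total_bytes / 1024
total_megabytes = total_kilobytes / 1024
```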

  16. Memory Allocation Method Table 4. The Hash table

  17. Memory Allocation Method Figure 2: How the word lookup table will be stored

  18. Data Management Algorithm
      Algorithm UnRedToRed( )
       1. Read file.
       2. Read characters to form a word until empty.
       3. Find its appropriate address from the hash table.
       4. Find the word in the lookup table.
       5. If found then
       6.     Check case.
       7.     If case = lower then
       8.         Fetch address.
       9.     else
      10.         Do the case management.
      11.         Fetch address.
      12.     Print the address.
      13. else
      14.     Give termination symbol.
      15.     Start ASCII storage (word).
      16. Go to step 1.
      End.

  19. Data Management Algorithm
      Algorithm RedToUnRed( )
       1. Read file.
       2. Fetch address.
       3. Check address status.
       4. If it is a word then
       5.     Print the word.
       6. If it is a situation-handling address then
       7.     Do according to it.
       8. Go to step 2.
      End.
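
The two algorithms can be illustrated as a round trip. This is a toy sketch under loud assumptions, not the thesis implementation: the seven-word `WORDS` table stands in for the full lookup table, a Python dict replaces the hash table, case handling is reduced to three flags, and out-of-table text is kept as raw characters instead of using a termination symbol and bit-packed ASCII storage. All names are hypothetical.

```python
import re

# Hypothetical toy table; the real table would hold the full dictionary.
WORDS = ["the", "rain", "in", "spain", "falls", "mainly", "plain"]
ADDRESS = {word: i for i, word in enumerate(WORDS)}  # stands in for the hash table

def reduce_text(text):
    """Unreduced -> reduced: replace known words by (address, case) entries."""
    out = []
    for token in re.findall(r"\w+|\W", text):
        addr = ADDRESS.get(token.lower())
        if addr is None:
            out.append(("raw", token))           # not in the table: keep characters
        elif token.islower():
            out.append(("addr", addr, "lower"))
        elif token.istitle():
            out.append(("addr", addr, "title"))  # simplified case management
        elif token.isupper():
            out.append(("addr", addr, "upper"))
        else:
            out.append(("raw", token))           # mixed case: keep characters
    return out

def expand_text(entries):
    """Reduced -> unreduced: look each address up and restore its case."""
    parts = []
    for entry in entries:
        if entry[0] == "raw":
            parts.append(entry[1])
        else:
            word = WORDS[entry[1]]
            restore = {"lower": word, "title": word.title(), "upper": word.upper()}
            parts.append(restore[entry[2]])
    return "".join(parts)

sample = "The rain in Spain falls mainly in the plain!"
reduced = reduce_text(sample)
restored = expand_text(reduced)
```

In this sketch every word of `sample` is in the table, so only the spaces and the exclamation mark are stored as raw characters, and `restored` equals `sample`.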

  20. Data Compression Methods ☼ Lossy Data Compression. ☼ Lossless Data Compression.

  21. Lossless Data Compression ☼ Run Length Encoding (RLE). ☼ Huffman Coding. ☼ Lempel-Ziv 77 Encoding (LZ77). ☼ Deflate Algorithm.

  22. Run Length Encoding Format: <Esc> <Specific Character> <Frequency>. Example: "XXXXXXXXXXXXXXX that’s all, folks!" is encoded as "<Esc> X <15> that’s all, folks!".
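
A minimal sketch of this escape-based scheme. The escape character, the `<n>` count syntax, and the minimum run length of 4 are assumptions made for illustration; the sketch also assumes the input never contains the escape character itself.

```python
import re
from itertools import groupby

ESC = "\x1b"   # assumed escape marker
MIN_RUN = 4    # assumed threshold: shorter runs are left as plain text

def rle_encode(text):
    """Replace each long run with ESC + character + <count>."""
    out = []
    for ch, group in groupby(text):
        n = len(list(group))
        out.append(f"{ESC}{ch}<{n}>" if n >= MIN_RUN else ch * n)
    return "".join(out)

def rle_decode(encoded):
    """Expand every ESC + character + <count> back into the run."""
    pattern = re.escape(ESC) + r"(.)<(\d+)>"
    return re.sub(pattern, lambda m: m.group(1) * int(m.group(2)),
                  encoded, flags=re.DOTALL)

original = "X" * 15 + " that's all, folks!"
encoded = rle_encode(original)
```

Here the fifteen X characters collapse into one escape sequence, while the short "ll" runs in the rest of the sentence are left untouched.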

  23. Huffman Coding Let a sequence of characters have the following frequencies: A = 29, B = 64, C = 32, D = 12, E = 9, F = 66, G = 23. Finally, we get a new binary code for each character.
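
The construction can be sketched with a priority queue, using the frequencies from this slide. This is a generic Huffman build, not necessarily the exact procedure or 0/1 labeling shown in the original slide; the function name is illustrative.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a prefix code by repeatedly merging the two least frequent subtrees."""
    tiebreak = count()  # keeps heap tuples comparable when frequencies are equal
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

freqs = {"A": 29, "B": 64, "C": 32, "D": 12, "E": 9, "F": 66, "G": 23}
codes = huffman_codes(freqs)
lengths = {sym: len(code) for sym, code in codes.items()}
total_bits = sum(freqs[sym] * lengths[sym] for sym in freqs)
```

For these frequencies the most frequent characters (B and F) receive 2-bit codes, A, C and G receive 3-bit codes, and the rarest (D and E) receive 4-bit codes.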

  24. Lempel-Ziv 77 Encoding Input: the_rain_in_Spain_falls_mainly_in_the_plain. Encoding steps: the_rain_; then the_rain_<3,3>; then the_rain_<3,3>Sp; then the_rain_<3,3>Sp<9,4>; then the_rain_<3,3>Sp<9,4>falls_m<11,3>. In binary, the pointer <3,3> would look like this: 00 00000011 0011.
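
The pointers in this example read as <distance back, length>. A small decoder sketch shows how they expand; the token list mirrors the last encoding step on the slide, and the bit layout in `pointer_bits` (2-bit flag, 8-bit distance, 4-bit length) is an assumption inferred from the slide's binary string.

```python
def lz77_expand(tokens):
    """Expand literals (str) and (distance, length) pointers into text."""
    out = []
    for token in tokens:
        if isinstance(token, str):
            out.extend(token)
        else:
            distance, length = token
            start = len(out) - distance
            for i in range(length):
                out.append(out[start + i])  # one char at a time, so copies may overlap
    return "".join(out)

def pointer_bits(distance, length):
    """Assumed layout: flag '00', then 8-bit distance, then 4-bit length."""
    return f"00 {distance:08b} {length:04b}"

tokens = ["the_rain_", (3, 3), "Sp", (9, 4), "falls_m", (11, 3)]
decoded = lz77_expand(tokens)
```

Decoding these tokens reproduces the input up to "the_rain_in_Spain_falls_main", which is as far as the slide's example goes.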

  25. Deflate Algorithm Combination of Huffman coding and Lempel-Ziv 77 encoding
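
Deflate is available in Python's standard `zlib` module, which wraps the Deflate stream in a small header and checksum. The sketch below only demonstrates a lossless round trip on repetitive sample text; it is not the experimental setup of the thesis, and the sample data is made up for illustration.

```python
import zlib

# Repetitive sample input; real results depend heavily on the text.
data = b"the_rain_in_Spain_falls_mainly_in_the_plain " * 20

compressed = zlib.compress(data, 9)   # level 9 = strongest compression
restored = zlib.decompress(compressed)

ratio = len(compressed) / len(data)   # fraction of the original size
```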

  26. Experimental Result

  27. Comparison with other methods ☼ In general, compression rates range from 12% up to 50%. ☼ The proposed method achieves a 53% reduction.

  28. Comparison with other zip software

  29. Comparison with other zip software

  30. Conclusion ☼ Text data storage reduced by 53%. ☼ After compression, 75% - 80%. ☼ Faster and portable.

  31. Questions

  32. Thank You
