Topic Modeling using Semantic and Network structure

Topic Modeling using Semantic and Network structure • Sophia (Xueyao) Liang • CPSC 503 Final Project

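Topic modeling • Unsupervised • Example with K=3 topics: (Olympic, Vancouver), (snow, cold), (moon light, spider man) • For each document d, the model gives a probability P(topic | d) for each of the three topics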

PLSA

PLSA • Latent topics zk ∈ {z1, z2, …, zN}

PLSA - Parameter inference • Expectation step • Maximization step
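The update formulas for this slide did not survive in the transcript. As a rough illustration only, the standard PLSA EM loop can be sketched as below. The function name `plsa`, its parameters, and the list-of-lists count matrix are assumptions made for this sketch, not the author's implementation.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-by-word count matrix (list of lists).

    Returns (p_z_d, p_w_z): P(z|d) per document and P(w|z) per topic.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total == 0:  # empty row: fall back to a uniform distribution
            return [1.0 / len(row)] * len(row)
        return [x / total for x in row]

    # Random initialisation of P(z|d) and P(w|z).
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # Expectation: P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(post)
                for z in range(n_topics):
                    r = n * post[z] / total
                    # Accumulate expected counts for the maximization step.
                    new_z_d[d][z] += r
                    new_w_z[z][w] += r
        # Maximization: renormalise the expected counts.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z
```

Calling `plsa(counts, n_topics=3)` on a small count matrix returns one topic distribution per document and one word distribution per topic; each row sums to 1.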

PHITS

Semantic + network

NetPLSA

NetPLSA Parameter Inference • No closed-form solution for the expectation step • Efficient algorithm: • Expectation (PLSA) • Maximization (PLSA) • The result of the previous steps may not yield a better value for O
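The slide does not spell out what happens after the PLSA steps. One possible reading, sketched here under stated assumptions, is to smooth each document's topic distribution toward its network neighbours and keep the smoothed version only when the objective O improves. The form of O used below (a weighted log-likelihood minus a graph penalty), the weights `lam` and `gamma`, and all function names are assumptions for illustration.

```python
import math


def log_likelihood(counts, p_z_d, p_w_z):
    """Log-likelihood of the word counts under the current PLSA parameters."""
    total = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                p = sum(p_z_d[d][z] * p_w_z[z][w] for z in range(len(p_w_z)))
                total += n * math.log(max(p, 1e-300))
    return total


def graph_penalty(edges, p_z_d):
    """Half the weighted squared difference of P(z|d) across linked documents."""
    return 0.5 * sum(
        wt * sum((a - b) ** 2 for a, b in zip(p_z_d[u], p_z_d[v]))
        for u, v, wt in edges
    )


def objective(counts, edges, p_z_d, p_w_z, lam):
    """Assumed form of O (to be maximized): text fit minus network penalty."""
    return ((1 - lam) * log_likelihood(counts, p_z_d, p_w_z)
            - lam * graph_penalty(edges, p_z_d))


def smooth_once(edges, p_z_d, gamma):
    """Move each P(z|d) a fraction gamma toward its neighbours' weighted mean."""
    n_topics = len(p_z_d[0])
    acc = [[0.0] * n_topics for _ in p_z_d]
    deg = [0.0] * len(p_z_d)
    for u, v, wt in edges:  # edges are treated as undirected
        for a, b in ((u, v), (v, u)):
            deg[a] += wt
            for z in range(n_topics):
                acc[a][z] += wt * p_z_d[b][z]
    out = []
    for d, row in enumerate(p_z_d):
        if deg[d] == 0:  # isolated document: nothing to smooth toward
            out.append(list(row))
        else:
            out.append([(1 - gamma) * row[z] + gamma * acc[d][z] / deg[d]
                        for z in range(n_topics)])
    return out


def smooth_if_better(counts, edges, p_z_d, p_w_z, lam=0.5, gamma=0.1):
    """Accept the smoothed P(z|d) only if it raises the objective."""
    candidate = smooth_once(edges, p_z_d, gamma)
    old = objective(counts, edges, p_z_d, p_w_z, lam)
    new = objective(counts, edges, candidate, p_w_z, lam)
    return candidate if new > old else p_z_d
```

Because a smoothed row is a convex combination of probability distributions, it still sums to 1, so no renormalisation is needed after the step.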

NetPLSA • Potential problems of the model • Parameter inference • Higher time complexity and slower convergence

CORPUS • Cora Data version 1.0 • 30,000 scientific papers, with citation information • Important files: papers (ID-name, link, author…), citations (ID-cited ID), classifications (link-category) • Directory: extractions (PostScript form of the papers) • Cited papers not in the corpus • No abstract for some PostScript files • Too many categories • Duplicated or isolated papers
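Several of the issues listed for the corpus concern the citation graph. A minimal cleaning pass, assuming the citations have already been parsed into (citing ID, cited ID) pairs, might look like the sketch below. The function name and input shapes are hypothetical, and it only removes repeated IDs; it does not try to detect the same paper stored under two different IDs.

```python
def clean_citations(paper_ids, citations):
    """Filter a citation list down to a usable graph.

    paper_ids: iterable of paper IDs present in the corpus (repeats allowed).
    citations: iterable of (citing_id, cited_id) pairs.
    Returns (kept_ids, edges), both sorted.
    """
    known = set(paper_ids)  # collapses repeated IDs
    edges = set()           # collapses repeated citation pairs
    for citing, cited in citations:
        # Drop citations that point outside the corpus, and self-citations.
        if citing in known and cited in known and citing != cited:
            edges.add((citing, cited))
    # Drop isolated papers: keep only IDs that appear in at least one edge.
    connected = {paper for edge in edges for paper in edge}
    return sorted(connected), sorted(edges)
```

For example, `clean_citations([1, 2, 3, 3, 4], [(1, 2), (1, 2), (2, 9), (3, 3)])` keeps papers 1 and 2 and the single edge (1, 2).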

CORPUS • Cora Data version 1.0 • Papers in category Machine Learning • About 2700 papers • 1400 Frequent Words (stop words removed, stemmed)
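The preprocessing named on this slide (stop-word removal, stemming, keeping only frequent words) can be sketched roughly as follows. The stop-word list is a tiny illustrative sample, `crude_stem` is a placeholder suffix stripper rather than a real stemmer such as Porter's, and all names here are assumptions for the sketch.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a much longer one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "with", "on"}


def crude_stem(token):
    """Placeholder stemmer: strips a few common suffixes (not Porter)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Lowercase, split on non-letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def bag_of_words(docs, vocab_size):
    """Build a count matrix over the vocab_size most frequent stems."""
    doc_tokens = [tokenize(d) for d in docs]
    freq = Counter(t for toks in doc_tokens for t in toks)
    vocab = [w for w, _ in freq.most_common(vocab_size)]
    index = {w: i for i, w in enumerate(vocab)}
    matrix = []
    for toks in doc_tokens:
        row = [0] * len(vocab)
        for t in toks:
            if t in index:
                row[index[t]] += 1
        matrix.append(row)
    return vocab, matrix
```

With the corpus described above, `vocab_size` would be about 1400 and `docs` would hold roughly 2700 paper texts.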

Results

Results • Overall: Accuracy (A), Accuracy (B), Recall • Accuracy and recall for each category
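The transcript does not preserve how Accuracy (A) and Accuracy (B) were defined, so the sketch below uses only the textbook definitions of overall accuracy and per-category recall. It assumes each paper has one true category and one predicted category; the function names are hypothetical.

```python
from collections import Counter


def overall_accuracy(true_labels, predicted):
    """Fraction of papers whose predicted category matches the true one."""
    correct = sum(t == p for t, p in zip(true_labels, predicted))
    return correct / len(true_labels)


def per_category_recall(true_labels, predicted):
    """For each true category, the fraction of its papers predicted correctly."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted) if t == p)
    return {category: hits[category] / totals[category] for category in totals}
```

Since a topic model is unsupervised, some mapping from discovered topics to category labels has to be chosen before these functions can be applied; that mapping is outside this sketch.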

EVALUATION • Justified the claim that adding network structure to the model can improve topic modeling results • Modeled the network at the scale of articles • Inherent problems exist in the chosen framework • The results are still far from satisfactory

Future work • How to model the network structure of blog articles, especially when modeling them at the scale of articles • Bag-of-words matrix extraction • A better integrated model, possibly LDA-based • Efficiency of the algorithm • Recommendation based on topic community discovery
