This workshop presentation proposes a framework that uses big data analytics to optimize portfolio decisions by considering both stock price data and current affairs data. The framework has five stages: stock selection, validation, clustering, ranking, and optimization. The expected outcomes are informed investment decisions, diversification of risk, and top portfolio suggestions for investors.
Workshop on Internet and BigData Finance (WIBF 14)
A Big Data Analytical Framework for Portfolio Optimization
Dhanya Jothimani, Ravi Shankar, Surendra S Yadav
Department of Management Studies, Indian Institute of Technology Delhi, India
Flow of Presentation
• Introduction
• Scope of Work
• Gaps Identified
• Objective
• Proposed Framework
• Expected Outcomes
• Conclusion
• References
Introduction
• Big Data: data sets ranging from a few hundred gigabytes to terabytes or even zettabytes (Qin et al., 2012)
• Dimensions: Volume, Velocity, Variety, Variability, Complexity, Low value density (Singh, 2012)
• Use of big data technologies in capital market firms (Verma & Mani, 2012)
• Next big challenge for capital markets: handling the velocity of the data being produced (Verma & Mani, 2012)
Introduction
• Types of Capital Market Data (Rauchman & Nazaruk, 2013)
  • Reference Data
  • Fundamental Data (corporate financials, etc.)
  • News (earnings reports, economic news, etc.)
  • Social Media (market sentiment, etc.)
Introduction
• Portfolio: Collection of assets
• Portfolio Optimization: Process of making investment decisions on holding a set of financial assets to meet various criteria
• Criteria (Markowitz, 1952):
  • Maximizing Return
  • Minimizing Risk
• Major Steps:
  • Asset Selection
  • Asset Weighting
  • Asset Management
Scope of the Work
• Assets range from stocks and bonds to real estate
• The scope of the framework is limited to stock analysis
Previous Works
Gaps Identified
• To the knowledge of the authors:
  • No framework handles both structured and unstructured data for portfolio optimization
  • Generally, only quantitative data is considered for fundamental analysis
Objective
• To propose a framework that considers both stock price data and current affairs data (i.e., news articles and market sentiment) to help investors make informed investment decisions
Proposed Framework
• 5-stage process
  • Stage 1: Selection of Stocks
  • Stage 2: Validation of Stocks
  • Stage 3: Stock Clustering
  • Stage 4: Stock Ranking
  • Stage 5: Optimization
Fig 1: Proposed Framework
Stage 1: Selection of Stocks
• Input: Firms listed on a particular stock exchange, each treated as a decision-making unit (DMU)
• Output: Efficient firms
• Methodology: Data Envelopment Analysis (DEA), a non-parametric linear programming technique (Cooper et al., 2004)
• Input parameters: Total assets, Total equity, Cost of sales and Operating expenses
• Output parameters: Net sales and Net income (Chen, 2008; Chen & Chen, 2011)
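The slides do not say which DEA variant is intended. As an illustration only, the block below writes out the input-oriented CCR multiplier model, one standard DEA formulation, for a firm o among n listed firms. The symbols are assumptions introduced here: x_{ij} is the amount of input i used by firm j (m = 4 inputs in this stage), y_{rj} is the amount of output r produced by firm j (s = 2 outputs), and v_i and u_r are the weights the linear program chooses.

```latex
\begin{aligned}
\max_{u,\,v}\quad & \theta_o = \sum_{r=1}^{s} u_r\, y_{ro} \\
\text{s.t.}\quad  & \sum_{i=1}^{m} v_i\, x_{io} = 1, \\
                  & \sum_{r=1}^{s} u_r\, y_{rj} - \sum_{i=1}^{m} v_i\, x_{ij} \le 0,
                    \qquad j = 1, \dots, n, \\
                  & u_r \ge 0,\quad v_i \ge 0 .
\end{aligned}
```

Under this model a firm is rated efficient when its optimal score \theta_o equals 1; one such program is solved per firm.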
Stage 2: Validation of Stocks
• Input: Tweets and news articles about the efficient firms
• Output: Validated firms
• Methodology: Sentiment analysis using distributed text mining
• Software framework: Hadoop MapReduce
• Properties: Ease of use, scalability and failover (Dittrich & Quiané-Ruiz, 2012)
Effect of Management on Stock Price
Source: http://www.firstbiz.com/data/graphic-how-infosys-stock-moved-since-nr-narayana-murthys-return-86318.html
Stage 2: Validation of Stocks
Figure 2: Hadoop Framework for Distributed Text Mining
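The map and reduce steps behind Stage 2 can be imitated in a few lines of plain Python. This is a single-process sketch of the MapReduce pattern, not Hadoop code. The lexicon, the firm names, and the rule "keep firms whose total score is at or above a threshold" are all assumptions made for illustration; the slides do not specify the scoring or validation rule.

```python
from collections import defaultdict

# Hypothetical sentiment lexicon (assumption); a real system would use a far larger one.
LEXICON = {"gain": 1, "growth": 1, "strong": 1, "loss": -1, "fraud": -1, "weak": -1}


def map_phase(record):
    """Map step: turn one (firm, text) record into a (firm, score) pair."""
    firm, text = record
    words = (w.strip(".,!?") for w in text.lower().split())
    yield firm, sum(LEXICON.get(w, 0) for w in words)


def reduce_phase(firm, scores):
    """Reduce step: sum all scores emitted for one firm."""
    return firm, sum(scores)


def run_job(records):
    """Run map, shuffle (group by firm), and reduce in a single process."""
    grouped = defaultdict(list)
    for record in records:
        for firm, score in map_phase(record):
            grouped[firm].append(score)  # shuffle: collect values by key
    return dict(reduce_phase(firm, scores) for firm, scores in grouped.items())


def validate(records, threshold=0):
    """Assumed validation rule: keep firms whose total sentiment >= threshold."""
    totals = run_job(records)
    return sorted(firm for firm, total in totals.items() if total >= threshold)


if __name__ == "__main__":
    sample = [
        ("FirmA", "Strong growth this quarter."),
        ("FirmB", "Fraud probe and heavy loss"),
        ("FirmA", "weak guidance"),
    ]
    print(run_job(sample))
    print(validate(sample))
```

In a Hadoop deployment the map and reduce functions would run on separate nodes and the grouping would be done by the framework's shuffle; the logic per record stays the same.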
Stage 3: Stock Clustering
• Input: Validated firms
• Output: Clusters of firms
• Methodology:
  • Correlation coefficient of the returns of stocks
  • Clustering algorithm: Louvain or k-means clustering (Blondel et al., 2008)
• Number and quality of clusters:
  • Maximize similarity within a cluster
  • Minimize similarity between clusters
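As a rough illustration of Stage 3, the sketch below computes simple returns, builds the Pearson correlation matrix, and then groups stocks whose pairwise correlation exceeds a threshold. The grouping step is a deliberately simple stand-in (connected components on a thresholded correlation graph), not the Louvain or k-means algorithm named on the slide. The function names, the 0.5 threshold, and the price series are assumptions.

```python
import math


def returns(prices):
    """Simple period-over-period returns from a price series."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(price_series):
    """Return (sorted names, matrix of return correlations)."""
    names = sorted(price_series)
    rets = {name: returns(price_series[name]) for name in names}
    return names, [[pearson(rets[a], rets[b]) for b in names] for a in names]


def threshold_clusters(names, corr, threshold=0.5):
    """Stand-in clustering: connected components of the graph that links
    two stocks whenever their return correlation exceeds the threshold."""
    unvisited = set(range(len(names)))
    clusters = []
    for start in range(len(names)):
        if start not in unvisited:
            continue
        unvisited.discard(start)
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(names[i])
            for j in sorted(unvisited):
                if corr[i][j] > threshold:
                    unvisited.discard(j)
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters


if __name__ == "__main__":
    # Made-up prices for three hypothetical stocks.
    prices = {
        "A": [100, 102, 101, 105],
        "B": [50, 51, 50.5, 52.5],
        "C": [30, 29, 30, 28],
    }
    names, corr = correlation_matrix(prices)
    print(threshold_clusters(names, corr))
```

Picking one stock from each resulting group is what gives the diversification effect the framework is after: stocks in different groups do not move together.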
Stage 4: Stock Ranking
• Selection of appropriate stocks from each cluster
• Input: Stocks in each cluster
• Output: Ranked stocks within each cluster
• Methodology: Artificial neural network (ANN) (Ware, 2005)
• Input parameters: GDP growth rate, Interest rate
• Output parameters: Future ROI (Koochakzadeh, 2013)
Stage 5: Optimization
• How much to invest in each stock?
• Input: Ranked stocks
• Output: Top 3 portfolio suggestions
• Methodology:
  • Markowitz Mean-Variance model (Markowitz, 1952)
  • Optimization heuristics
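To make the mean-variance idea concrete, the sketch below computes a portfolio's expected return and variance from its weights, and the closed-form minimum-variance weights for the two-asset case. It is a toy illustration under stated assumptions: the function names and numbers are invented, short positions are not ruled out, and it does not reproduce the heuristics or the top-3 selection mentioned on the slide.

```python
def portfolio_return(weights, mean_returns):
    """Expected portfolio return: weighted sum of the asset mean returns."""
    return sum(w * m for w, m in zip(weights, mean_returns))


def portfolio_variance(weights, cov):
    """Portfolio variance w' * Sigma * w for covariance matrix cov."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))


def min_variance_two_assets(cov):
    """Closed-form minimum-variance weights for two fully invested assets.

    w1 = (s22 - s12) / (s11 + s22 - 2 * s12), w2 = 1 - w1.
    Weights may fall outside [0, 1] (no short-selling constraint here).
    """
    s11, s22, s12 = cov[0][0], cov[1][1], cov[0][1]
    w1 = (s22 - s12) / (s11 + s22 - 2 * s12)
    return [w1, 1 - w1]


if __name__ == "__main__":
    # Made-up inputs: two uncorrelated assets.
    cov = [[0.04, 0.0], [0.0, 0.01]]
    means = [0.10, 0.05]
    w = min_variance_two_assets(cov)
    print(w, portfolio_return(w, means), portfolio_variance(w, cov))
```

With more than two assets or with extra constraints there is no such one-line formula, which is where a quadratic programming solver or the optimization heuristics listed on the slide come in.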
Expected Outcomes
• Stage 1: Asset selection
• Stage 2: Consideration of the qualitative aspects of firms
• Stage 3: Diversification of risk
• Stage 4: Aids in the selection of appropriate stocks
• Stage 5: Top 3 portfolio suggestions to the investors
Conclusion
• Informed investment decisions
• Considers both structured and unstructured data
• Flexibility for investors in selecting appropriate stocks
• Applicability to any stock data
• Limitations:
  • Considers only stock analysis
  • Asset management is not considered
• Future Work: Implementation of the framework
References
• E. Patari, T. Leivo, and S. Honkapuro, "Enhancement of equity portfolio performance using data envelopment analysis," European Journal of Operational Research, vol. 220, no. 3, pp. 786–797, 2012.
• H.-H. Chen, "Stock selection using data envelopment analysis," Industrial Management and Data Systems, vol. 108, no. 9, pp. 1255–1268, 2008.
• H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.
• J. Bollen, H. Mao, and X. Zeng, "Twitter mood predicts the stock market," Journal of Computational Science, vol. 2, no. 1, pp. 1–8, 2011.
• J. Dittrich and J.-A. Quiané-Ruiz, "Efficient big data processing in Hadoop MapReduce," Proc. VLDB Endow., vol. 5, pp. 2014–2015, Aug. 2012.
• J. Gemela, "Financial analysis using Bayesian networks," Applied Stochastic Models in Business and Industry, vol. 17, no. 1, pp. 57–67, 2001.
• K. Karpio, P. Lukasiewicz, A. Orlowski, and T. Zabkowski, "Mining associations on the Warsaw stock exchange," Proceedings of the 6th Polish Symposium of Physics in Economy and Social Sciences (FENS2012), vol. 123, no. 3, pp. 553–559, 2013.
• L. Bakker, W. Hare, H. Khosravi, and B. Ramadanovic, "A social network model of investment behaviour in the stock market," Physica A: Statistical Mechanics and its Applications, vol. 389, no. 6, pp. 1223–1229, 2010.
• M. Dia, "A portfolio selection methodology based on data envelopment analysis," INFOR, vol. 47, no. 1, pp. 51–57, 2009.
• M. Ismail, N. Salamudin, N. Rahman, and B. Kamaruddin, "DEA portfolio selection in Malaysian stock market," 2012 International Conference on Innovation Management and Technology Research (ICIMTR), pp. 739–743, 2012.
• M. Rauchman and A. Nazaruk, "Big Data in Capital Markets." Available online at: http://www.sigmod.org/2013/keynote1_slides.pdf, Accessed on: May 30, 2014.
• N. C. P. Edirisinghe and X. Zhang, "Portfolio selection under DEA-based relative financial strength indicators: case of US industries," Journal of the Operational Research Society, vol. 59, no. 6, pp. 842–856, 2007.
• N. Koochakzadeh, A heuristic stock portfolio optimization approach based on data mining techniques. PhD thesis, Department of Computer Science, University of Calgary, March 2013.
• N. S. Sachchidanand Singh, "Big data analytics," in International Conference on Communication, Information & Computing Technology (ICCICT), pp. 1–4, 2012.
• P. Gupta, G. Mittal, and M. Mehlawat, "A multicriteria optimization model of portfolio rebalancing with transaction costs in fuzzy environment," Memetic Computing, pp. 1–14, 2012.
• R. P. Schumaker and H. Chen, "Textual analysis of stock market prediction using breaking financial news: The AZFin text system," ACM Transactions on Information Systems, vol. 27, no. 2, pp. 1–19, 2009.
• R. Verma and S. R. Mani, "Use of big data technologies in capital markets." Available online at: http://www.infosys.com/industries/financialservices/whitepapers/Documents/big-data-analytics.pdf, 2012, Accessed on: April 2, 2014.
• S. Shen, H. Jiang, and T. Zhang, "Stock market forecasting using machine learning algorithms." Available online at: http://cs229.stanford.edu/proj2012/ShenJiangZhang-StockMarketForecastingusingMachineLearningAlgorithms.pdf, 2012.
• S. T. Li and Y. C. Cheng, "A stochastic HMM-based forecasting model for fuzzy time series," IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol. 40, no. 5, pp. 1255–1266, 2010.
• S. Y. Xu, "Stock price forecasting using information from Yahoo Finance and Google Trend." Available online at: https://www.econ.berkeley.edu/sites/default/files/Selene, 2012.
• T. S. Quah, "DJIA stock selection assisted by neural network," Expert Systems with Applications, vol. 35, no. 12, pp. 50–58, 2008.
• T. Ware, "Adaptive statistical evaluation tools for equity ranking models." Submitted to: Canadian Industrial Problem Solving Workshops (Calgary, Canada, May 15-19, 2005), 2005.
• V. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, "Fast unfolding of communities in large networks," Journal of Statistical Mechanics: Theory and Experiment, p. 10008, 2008.
• W. Cooper, L. Seiford, and J. Zhu, "Data envelopment analysis: History, models and interpretations," in Handbook on Data Envelopment Analysis (W. W. Cooper, L. M. Seiford, and J. Zhu, eds.), vol. 71 of International Series in Operations Research & Management Science, pp. 1–39, Springer US, 2004.
• X. Qin, H. Wang, F. Li, B. Zhou, Y. Cao, C. Li, H. Chen, X. Zhou, X. Du, and S. Wang, "Beyond simple integration of RDBMS and MapReduce – Paving the way toward a unified system for big data analytics: Vision and progress," in Proceedings of the 2012 Second International Conference on Cloud and Green Computing, CGC '12, (Washington, DC, USA), pp. 716–725, IEEE Computer Society, 2012.
• Y. S. Chen and B. S. Chen, "Applying DEA, MPI and Grey model to explore the operation performance of the Taiwanese wafer fabrication industry," Technological Forecasting and Social Change, vol. 78, no. 3, pp. 536–546, 2011.
• Y. Zuo and E. Kita, "Stock price forecast using Bayesian network," Expert Systems with Applications, vol. 39, no. 8, pp. 6729–6737, 2012.
Thank You