
Mapping science using Bibexcel and Pajek By Olle Persson



Presentation Transcript


  1. Mapping science using Bibexcel and Pajek By Olle Persson

  2. Relations • Units of analysis: document level; aggregated level (authors, universities, countries, journals …) • Citation-based relations: direct citations, shared references, co-citations • Co-occurrences: co-authorships, co-word

  3. Citation-based relations between documents (diagram with documents A, B, C and D) • A cites C = direct citation • A and C both cite B = bibliographic coupling • A and C are both cited by D = co-citation
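
The three relations on this slide can be sketched in a few lines of Python. This is only an illustration, not how Bibexcel computes them; the toy citation data and the function names are assumptions made for the example.

```python
# Toy citation data (hypothetical): each document maps to the set of documents it cites.
cites = {
    "A": {"B", "C"},
    "C": {"B"},
    "D": {"A", "C"},
    "B": set(),
}

def direct_citation(x, y):
    """True if x cites y."""
    return y in cites.get(x, set())

def bibliographic_coupling(x, y):
    """Number of references shared by x and y."""
    return len(cites.get(x, set()) & cites.get(y, set()))

def co_citation(x, y):
    """Number of documents that cite both x and y."""
    return sum(1 for refs in cites.values() if x in refs and y in refs)

print(direct_citation("A", "C"), bibliographic_coupling("A", "C"), co_citation("A", "C"))
```

In the toy data, A cites C, A and C share the reference B, and D cites both A and C, which mirrors the three cases named on the slide.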

  4. Similarity measures • Frequencies (raw counts): number of direct citations, number of co-occurrences, number of shared references • Normalized measures: Salton’s index, Jaccard’s index, Pearson’s correlation
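
As a rough illustration of two of the normalized measures, the sketch below uses the standard textbook definitions of Salton's index (cosine) and Jaccard's index. The function and parameter names are assumptions; `c_ij` is the co-occurrence count and `c_i`, `c_j` are the total counts of the two units.

```python
import math

def salton(c_ij, c_i, c_j):
    """Salton's index: co-occurrences divided by the geometric mean of the two totals."""
    return c_ij / math.sqrt(c_i * c_j)

def jaccard(c_ij, c_i, c_j):
    """Jaccard's index: co-occurrences divided by the size of the union."""
    return c_ij / (c_i + c_j - c_ij)

# Two papers cited 20 and 50 times, co-cited 10 times (made-up numbers).
print(salton(10, 20, 50))
print(jaccard(10, 20, 50))
```

Both measures equal 1 when two units always occur together and fall toward 0 as the overlap shrinks relative to the totals.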

  5. Mapping science • Preparing data • Calculating measures • Making maps. It helps if you have some experience with Pajek. You will learn the basics of Bibexcel in this tutorial!

  6. You will need this material
     • A set of data: http://www8.umu.se/inforsk/esss/cocit569.tx2
     • Bibexcel software: http://www8.umu.se/inforsk/Bibexcel/bibexcel.exe
     • Pajek: http://vlado.fmf.uni-lj.si/pub/networks/pajek/
     • Reading material, 1st chapter in: http://www8.umu.se/inforsk/Bibexcel/ollepersson60.pdf

  7. Preparing data

  8. 1. Convert to Dialog format. Search used: Topic=(co-citation* OR cocitation*); Databases=SCI-EXPANDED, SSCI, A&HCI; Timespan=All Years; updated 2011-03-04.
     • We have already searched and downloaded 569 records on co-citation analysis from Web of Science
     • We have already replaced line feeds with carriage returns in the downloaded file using Bibexcel: Edit doc-file/Replace line feed with carriage return
     • The file to be used is cocit569.tx2
     • Put Bibexcel.exe in c:\Bibexcel and cocit569.tx2 in c:\Bibexcel\Data
     • Start bibexcel.exe; next we have to convert the file to the Dialog format that Bibexcel is designed for

  9. You can open Bibexcel and make all steps in this presentation! Select the cocit569.tx2 file and run Misc/Convert to Dialog format/Convert from Web of Science

  10. Select cocit569.doc and press View file. Legend: each field starts with a two-letter field tag • ; = separates units • | = end of field • || = end of record
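
To make the delimiters on this slide concrete, here is a small parsing sketch. The exact file layout is an assumption: the sample assumes each field looks like `XX- content|` and that a record ends with `||`. The sample record is invented and `parse_records` is a hypothetical helper, not part of Bibexcel.

```python
def parse_records(text):
    """Split Dialog-style text into records, each a dict of tag -> list of units."""
    records = []
    for raw in text.split("||"):          # "||" is assumed to close a record
        record = {}
        for field in raw.split("|"):      # "|" closes a field
            field = field.strip()
            if len(field) < 3 or field[2] != "-":
                continue  # skip empty fragments and anything without a two-letter tag
            tag, content = field[:2], field[3:].strip()
            # ";" separates the units inside a field
            record[tag] = [unit.strip() for unit in content.split(";") if unit.strip()]
        if record:
            records.append(record)
    return records

# Invented record with an author field (AU) and a cited-documents field (CD).
sample = "AU- Author A; Author B|\nCD- Ref One, 1990;Ref Two, 1995|\n||\n"
print(parse_records(sample))
```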

  11. 2. Extracting data from the CD field (cited documents). Let’s start! Put the tag (CD) in the tag box; units are separated by semicolons.

  12. cocit569.out has the cited documents. The rows marked on the slide are the reference list of doc nr 1.

  13. 3. Refining the out-file. To improve data quality, the Edit out-files menu has several options. For example, you may wish to reduce variation by allowing only the 1st initial in author names: select cocit569.out and run Edit out-files/Keep only author’s first initial

  14. Look at cocit569.1st and you can see that EOM SB is changed to EOM S
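
The effect of this step (EOM SB becoming EOM S) can be imitated with a short function. This is a sketch under the assumption that a cited reference string starts with "SURNAME INITIALS" followed by a comma; the sample string and the name `keep_first_initial` are hypothetical, and Bibexcel's own rules may differ.

```python
def keep_first_initial(reference):
    """Trim the author's initials to one letter in a 'SURNAME INITIALS, ...' string."""
    author, sep, rest = reference.partition(",")
    parts = author.split()
    if len(parts) >= 2:
        parts[-1] = parts[-1][0]  # keep only the first initial
    return " ".join(parts) + sep + rest

# Hypothetical reference string; only the author part follows the slide.
print(keep_first_initial("EOM SB, 2000, SOME JOURNAL"))
```

Merging such variants means that references written with one or with two initials end up as the same string before frequencies are counted.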

  15. Let’s improve a little bit more: Select cocit569.1st and run Edit outfiles/Convert Upper lower Case/Good for Cited reference strings

  16. Look at cocit569.low. I think this looks much nicer compared to the out-file!

  17. Calculating data

  18. 1. Looking at frequencies. Select cocit569.low, tick the box marked on the slide, choose Whole string and press Start!

  19. Look at cocit569.cit, which has the cited references in decreasing frequency! For anyone familiar with co-citation research, the top 3 papers shouldn’t come as a surprise.
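
Counting whole strings and sorting them by decreasing frequency is what the cit-file contains. A minimal sketch of that idea, assuming the out-file can be read as (document number, cited reference) rows; the rows and the function name are made up for illustration.

```python
from collections import Counter

# Hypothetical out-file rows: (document number, cited reference string).
rows = [
    (1, "Ref A"), (1, "Ref B"),
    (2, "Ref A"), (2, "Ref C"),
    (3, "Ref A"), (3, "Ref B"),
]

def frequencies(rows):
    """Count whole strings and return them in decreasing frequency."""
    counts = Counter(unit for _, unit in rows)
    return counts.most_common()

for unit, n in frequencies(rows):
    print(n, unit)
```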

  20. 2. Making co-citations. Select the cocit569.cit file and press View file. In The List, mark the cited references down to frequency=30, then press Copy, then Clear and then Paste. These are the references for which you want co-citations.

  21. Select the cocit569.low-file, and run Analyze/Co-occurrence/Make pairs via listbox, and answer No to the next question, and OK for the question after that!

  22. The cocit569.coc file has the co-citation frequencies. We will use that file for mapping!
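
The pair counting behind the coc-file can be sketched as follows: for every document, take the selected references it cites and count each pair once. This is an assumption-laden illustration; the rows, the selected list and the function name are hypothetical and do not reproduce Bibexcel's file layout.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical out-file rows: (document number, cited reference string).
rows = [
    (1, "Ref A"), (1, "Ref B"),
    (2, "Ref A"), (2, "Ref C"),
    (3, "Ref A"), (3, "Ref B"),
]

def co_occurrences(rows, selected):
    """Count how often two selected units appear in the same document."""
    by_doc = defaultdict(set)
    for doc, unit in rows:
        if unit in selected:
            by_doc[doc].add(unit)
    pairs = Counter()
    for units in by_doc.values():
        for a, b in combinations(sorted(units), 2):
            pairs[(a, b)] += 1
    return pairs

# Only references in the selected list (the role played by The List) are paired.
print(co_occurrences(rows, {"Ref A", "Ref B"}))
```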

  23. Select cocit569.coc and run Mapping/Create net-file for Pajek … be sure to answer No to the question about directed arcs, since we do not have any directions here.

  24. The cocit569.net file can be opened from within Pajek, Netdraw, Mapequation etc. for drawing maps.
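
For readers curious about what a net-file looks like, the sketch below writes an undirected, weighted network in Pajek's plain-text format (a `*Vertices` section followed by `*Edges`). How Bibexcel orders vertices or formats labels is not stated in the slides, so the ordering here and the name `to_pajek_net` are assumptions.

```python
def to_pajek_net(pairs):
    """Return Pajek .net text for undirected weighted pairs {(a, b): weight}."""
    labels = sorted({label for pair in pairs for label in pair})
    index = {label: i for i, label in enumerate(labels, start=1)}
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{i} "{label}"' for label, i in index.items()]
    lines.append("*Edges")  # undirected lines; a directed network would use *Arcs
    lines += [f"{index[a]} {index[b]} {w}" for (a, b), w in pairs.items()]
    return "\n".join(lines) + "\n"

print(to_pajek_net({("Ref A", "Ref B"): 2, ("Ref A", "Ref C"): 1}))
```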

  25. Mapping with Pajek

  26. Open cocit569.net file in Pajek, and then Draw/Draw

  27. This is the first layout, with randomly ordered nodes. At the upper left, choose Layout/Energy/Kamada-Kawai/Separate components or just press Ctrl-K

  28. The Kamada-Kawai layout is better, but there are perhaps still too many lines in the graph, since almost every node is connected to all the others

  29. To reduce complexity minimize the draw window and then run Net/Transform/Remove/Lines with Value/lower than/ and put 10 in the box and answer yes to Make new network. After that run Draw/Draw again!
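
The thresholding step amounts to dropping every line whose value is below the chosen number while keeping all nodes. A minimal sketch, with hypothetical pair data and function name:

```python
def remove_lines_lower_than(pairs, threshold):
    """Keep only lines whose value is at least the threshold."""
    return {pair: w for pair, w in pairs.items() if w >= threshold}

# Made-up co-citation values.
pairs = {("Ref A", "Ref B"): 25, ("Ref A", "Ref C"): 9, ("Ref B", "Ref C"): 10}
print(remove_lines_lower_than(pairs, 10))
```

Nodes that lose all their lines stay in the network as isolates, which is why the later slides extract the largest component.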

  30. This map has more structure. We find older papers to the left and newer ones to the right. You can press Ctrl-K several times to see what happens

  31. Making vectors. Making circles on nodes based on citation frequencies: go to Bibexcel, select cocit569.cit and then run Mapping/Create vec-file. Below you can see that cocit569.vec is created

  32. Go back to Pajek. Open the Vector file cocit569.vec and then run Draw/Draw-Vector

  33. Now you can see that circle sizes correspond to the number of citations
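
A Pajek vec-file is a `*Vertices` header followed by one value per vertex, in the same order as the vertices of the net-file. The sketch below shows that alignment; the labels, counts and the name `to_pajek_vec` are assumptions for illustration.

```python
def to_pajek_vec(labels, counts):
    """Return Pajek .vec text: one value per vertex, in the vertex order of the net-file."""
    lines = [f"*Vertices {len(labels)}"]
    lines += [str(counts.get(label, 0)) for label in labels]
    return "\n".join(lines) + "\n"

# Vertex order as in the (hypothetical) net-file, and citation frequencies per label.
labels = ["Ref A", "Ref B", "Ref C"]
counts = {"Ref A": 3, "Ref B": 2, "Ref C": 1}
print(to_pajek_vec(labels, counts))
```

If the value order does not match the vertex order, circle sizes end up on the wrong nodes.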

  34. Making partitions
     • If you wish, you can use Bibexcel to create a clu-file that indicates the publication year, or decade, of the cited documents.
     • Select cocit569.cit and run Edit out-file/Extract publication year from references, and you will get a file named cocit569.dpy.
     • Select cocit569.dpy and run Mapping/Create clu-file, and you will get a file named cocit569.clu.
     • Go to Pajek and open cocit569.clu as partition.
     • Run Draw/Draw-Partition-Vector and then, in the draw window, Layers/In y direction.
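
The partition idea can be sketched as: pull a year out of each cited reference string, turn it into a decade, and write one class number per vertex in Pajek's clu format. The regular expression, the sample strings and the function names are assumptions; Bibexcel's own extraction rules are not described in the slides.

```python
import re

def publication_year(reference):
    """Return the first four-digit year found in a cited reference string, or 0."""
    match = re.search(r"\b(1[5-9]\d{2}|20\d{2})\b", reference)
    return int(match.group(1)) if match else 0

def to_pajek_clu(labels):
    """Return Pajek .clu text with the decade of each vertex as its class."""
    lines = [f"*Vertices {len(labels)}"]
    lines += [str(publication_year(label) // 10 * 10) for label in labels]
    return "\n".join(lines) + "\n"

# Invented reference strings in vertex order.
labels = ["Author A, 1973, Some Journal, V24, P265", "Author B, 1981, Other Journal"]
print(to_pajek_clu(labels))
```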

  35. Makes sense?

  36. Using Options/Lines/Different Widths and GreyScale, and Options/Size/Of lines = 0.25. This could be a chronological reading list for reviewers and students

  37. Bibexcel makes so many files…
     • cocit569.tx2: text-file where LF was replaced by CR
     • cocit569.doc: converted to Dialog format
     • cocit569.out: out-file based on the CD field
     • cocit569.1st: keep only author’s first initial
     • cocit569.low: converted to upper and lower case
     • cocit569.cit: frequencies
     • cocit569.coc: co-occurrences
     • cocit569.net: net-file to be opened in Pajek
     • cocit569.vec: vec-file to be opened as Vectors in Pajek
     • cocit569.clu: clu-file to be opened as Partitions in Pajek
     • cocit569.vel: vertices of the net-file, for use by Bibexcel
     … but better to have them than not!

  38. All-author co-citation analysis using Scopus records. “It’s always better not to limit to the 1st cited author as in WoS”
     • Get scopuscocit.ris from http://www8.umu.se/inforsk/esss/scopuscocit.ris
     • Select scopuscocit.ris and run Edit doc-file/Replace line feed with carriage return
     • Select scopuscocit.tx2 and run Misc/Convert to Dialog format/Convert from Scopus RIS format
     • Select scopuscocit.doc, put CD in Old tag, choose “Any ; separated field” and press Prep
     • Select scopuscocit.out and run Edit out-file/Scopus tools/Extract all authors from Scopus references
     • Select scopuscocit.sco and run Edit out-file/Decompress outfile
     • Select scopuscocit.nnu, choose Whole string, mark Remove duplicates and Make new out-file, and then press Start
     • Select scopuscocit.oux, mark Sort descending and press Start
     • Select scopuscocit.cit, press View file and select units down to frequency=30, and be sure only these are in The List
     • Select scopuscocit.oux and run Analyze/Co-occurrences/Make pairs via list box
     • Select the scopuscocit.coc file and then run Mapping/Create net-file for Pajek…
     • Select scopuscocit.cit and run Mapping/Create vec-file
     • Go to Pajek and open scopuscocit.net as Network and scopuscocit.vec as Vectors
     • Run Draw/Draw-Vector…

  39. Draw-vector

  40. To reduce complexity, minimize the draw window and then run Net/Transform/Remove/Lines with Value/lower than/, put 10 in the box and answer yes to Make new network. After that run Draw/Draw-vector and then Ctrl-K. Notes on the map: Griffith BC would probably not show up in a 1st-author analysis. Webometrics: go back and fix this variant!

  41. For vector graphic quality: in the Draw window run Export/2D/SVG/General and save as allauthormap.htm. Get Inkscape free from http://inkscape.org/download/ and open allauthormap.htm, edit and export to png format

  42. Analyzing direct citations on Web of Science records
     • Select cocit569.low and run Analyze/Citations among docs/Make citation links. This will make cocit569.lin, which has the citing docnr in the first column and the cited docnr in the second column.
     • Of course you need to label the doc numbers. Select cocit569.ddc and double click in the box at “Type new file name here”, and the path to cocit569.ddc should appear.
     • Select cocit569.lin and run Add data classify/Add labels to docnr-docnr pairs. Answer No to the questions about swapping, self-related pairs, overlapping sets, and about writing doc numbers in addition to labels.
     • Select cocit569.add and then run Mapping/Create net-file for Pajek and answer Yes for directed graphs!
     • Open cocit569.net in Pajek and Draw/Draw.
     • You will need to reduce complexity: run Net/Transform/Reduction/Degree/Input and set value=15. Then Draw!
     • If you would like to have different circle sizes: minimize the Draw window and then run Net/Vector/Summing up values of lines/Input. A Vector is created that has the number of inlinks to each node. Then Draw/Draw-vector…

  43. Analyzing using Weighted Direct Citations (WDC). We can add the number of shared outlinks and inlinks to each direct citation, to give each direct citation a different strength
     • Select cocit569.lin and run Analyze/Citations among docs/Weighted Direct Citations (WDC). The cocit569.wdc file has the WDC values for each docnr-docnr pair.
     • Again you need to label the doc numbers. Select cocit569.ddc and double click in the box at “Type new file name here”, and the path to cocit569.ddc should appear.
     • Select cocit569.wdc and run Add data classify/Add labels to freq-docnr-docnr/making freq-label-label. Answer No to the questions about swapping, self-related pairs, and overlapping sets.
     • Select the cocit569.cdd file and run Edit out-file/Sort numeric/Descending by first column, and you will see which are the strongest links by the WDC measure.
     • Select cocit569.cdd and run Mapping/Create net-file for Pajek, and answer Yes for directed arcs!
     • In Pajek use Net/Transform/Remove/Lines with Values/Lower than=10!
     • Then Draw/Draw and you will see one big network component, several smaller ones and quite a few isolates. You can zoom in on the bigger one by pressing the right mouse button and drawing.
     • If you go back to the Pajek main window, run Net/Components/Weak and type size=20, you will get 1 component; then with Operations/Extract from network/Partition=1 you will get a new network with the big component. Then Draw that network!
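
The weighting idea can be sketched as follows: for each direct citation, count the outlinks the two documents share (shared references) and the inlinks they share (co-citations). This is an assumption-based illustration, not Bibexcel's WDC routine. The slide does not say whether the link itself is counted or whether any normalization is applied, so the sketch simply sums the two counts, and it only sees links among the documents in the list.

```python
from collections import defaultdict

def weighted_direct_citations(links):
    """links: list of (citing, cited) doc numbers. Returns {(citing, cited): weight}."""
    out_links = defaultdict(set)  # doc -> docs it cites
    in_links = defaultdict(set)   # doc -> docs citing it
    for citing, cited in links:
        out_links[citing].add(cited)
        in_links[cited].add(citing)
    weights = {}
    for citing, cited in links:
        shared_out = len(out_links[citing] & out_links[cited])  # shared references
        shared_in = len(in_links[citing] & in_links[cited])     # co-citations
        weights[(citing, cited)] = shared_out + shared_in
    return weights

# Hypothetical citation links among four documents.
links = [(1, 2), (1, 3), (2, 3), (4, 1), (4, 2)]
print(weighted_direct_citations(links))
```

In the toy data the link from 1 to 2 gets the highest weight, because both documents cite 3 and both are cited by 4.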

  44. …further improvement by saving the major component and adding new partitions and vectors
     • Be sure to mark the main component (with 63 nodes)
     • Then File/Network/Save and overwrite cocit569.net
     • In Bibexcel select cocit569.net and run Mapping/Create vel-file from net-file
     • Select the cocit569.ddc file and run Edit out-file/Extract publication year from references
     • Select cocit569.dpy and run Mapping/Create clu-file
     • Open cocit569.clu as Partition in Pajek and then Draw/Draw-partition and then Layers/In y direction
     • If you would like to have different circle sizes: minimize the Draw window and then run Net/Vector/Summing up values of lines/Input. A Vector is created that has the sum of WDC values of inlinks to each node. Then Draw/Draw-Partition-Vector…

  45. …reduce direct citations by citation year lag
     • Select cocit569.cdd and run Analyze/Calculate year lags in pairs and answer Yes to add year lag values, which will come in column 1. Column 2 has a normalization (col. 3 divided by col. 3), col. 3 has the WDC value, col. 4 the citing doc and col. 5 the cited doc.
     • Select cocit569.lag and, to get year lags of 0-2 years, put 2 in the Max number Box and then run Edit out-files/Delete values high frequencies
     • Select cocit569.max, put 3/4/5 in The Box and run Edit out-file/Select columns
     • Now cocit569.col has WDC values only for links no older than 2 years!
     • Select cocit569.col and run Mapping/Create net-file for Pajek
     • Go to Pajek and open the net-file and the vec-file! Remove lines with values less than 5, then Net/Components/Weak (min 20), then extract and save the major component to the file cocit569.net
     • In Bibexcel, select cocit569.cdd, put 1/3 in The Box and run Edit out-files/Select columns, then select cocit569.col and make frequencies with whole string; cocit569.cit will then have the number of times a paper is cited.
     • In Bibexcel select cocit569.net and run Mapping/Create vel-file from net-file and then select cocit569.cit and run Mapping/Create vec-file
     • Back to Pajek and open the vec-file, and then Draw/Draw-vector
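
The year-lag filter boils down to keeping only those citation links where the citing paper appeared at most a given number of years after the cited one. A minimal sketch, assuming rows of (value, citing, cited) and a lookup of publication years; all names and data are hypothetical, and how Bibexcel treats negative lags is not stated, so they are dropped here.

```python
def within_year_lag(pairs, years, max_lag):
    """Keep (value, citing, cited) rows whose citing-minus-cited year lag is 0..max_lag."""
    kept = []
    for value, citing, cited in pairs:
        lag = years[citing] - years[cited]
        if 0 <= lag <= max_lag:
            kept.append((value, citing, cited))
    return kept

# Invented links with WDC-like values and publication years.
years = {"P1": 2000, "P2": 2004, "P3": 2005}
pairs = [(12, "P3", "P1"), (7, "P3", "P2")]
print(within_year_lag(pairs, years, 2))
```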

  46. Time dimension is here!

  47. …also, you can reduce co-citations by citation year lag
     • Select cocit569.coc and run Analyze/Calculate year lags in pairs and answer Yes to add year lag values
     • Select cocit569.lag and, to get year lags of 0-5 years, put 5 in the Max number Box and then run Edit out-files/Delete values high frequencies
     • Select cocit569.max, put 1/4/5 in The Box and run Edit out-file/Select columns
     • Now cocit569.col has co-citation values only for pairs no older than 5 years!
     • Select cocit569.col and run Mapping/Create net-file for Pajek
     • Also select cocit569.cit and run Mapping/Create vec-file
     • Go to Pajek and open the net-file and the vec-file!
