Thomas Krichel 2009-03-13 RePEc as frontier repository, the business model and what it means to survive as network in a more and more web-collaborative academia and a developing semantic web
acknowledgements • Marcus Dejardin for suggesting that I come here. • ERIM for sponsoring my stay. • Wilfred Mijnhard for suggesting the title.
sorry • I did not have access to powerpoint, so the slides may not be as clean as one would hope. • Jetlag means I may not be as sharp as normal.
dissecting the title I • RePEc as frontier repository I am not sure what that is, but I will start by talking about RePEc, using a historical approach. • the business model and what it means to survive as network I think this hints at sustainability. Sustainability can be achieved as a network.
dissecting the title II • in a more and more web-collaborative academia The web may not make academia itself more collaborative, but it can be used for a collaborative scholarly communication infrastructure. • and a developing semantic web I am still a bit unclear what the semantic web is.
overview • historical introduction to RePEc • sustainability discussion • RePEc opportunities • the semantic web
RePEc History • It started with me as a research assistant in the Economics Department of Loughborough University of Technology in 1990. • A predecessor of the Internet allowed me to download free software without effort, • but academic papers had to be gathered in a painful way.
CoREJ • published by HMSO • Photocopied tables of contents of recently published economics journals received at the Department of Trade and Industry • Typed list of working papers recently received by the University of Warwick library • The latter was the more interesting.
working papers • early accounts of research findings • published by economics departments • in universities • in research centers • in some government offices • in multinational administrations • disseminated through exchange agreements • important because of the 4-year publishing delay
1991-1992 • I planned to circulate the Warwick working paper list over listserv lists. • I argued it would be good for them: • increase incentives to contribute • increase revenue for ILL • After many trials, Warwick refused. • Towards the end of that time, I was offered a lectureship, and decided to start working on my own collection.
1993: BibEc and WoPEc • Fethy Mili of Université de Montréal had a good collection of papers and gave me his data. • I put his bibliographic data on a gopher and called the service "BibEc" • I also gathered the first ever online electronic working papers on a gopher and called the service "WoPEc".
NetEc consortium • BibEc printed papers • WoPEc electronic papers • CodEc software • WebEc web resource listings • JokEc jokes • HoPEc a lot of Ec!
WoPEc to RePEc • WoPEc was a catalog record collection • WoPEc remained the largest web access point • but getting contributions was tough • In 1996 I wrote the basic architecture for RePEc. • ReDIF • Guildford Protocol
1997: RePEc principle • Many archives • archives offer metadata about digital objects (mainly working papers) • One database • The data from all archives forms one single logical database despite the fact that it is held on different servers. • Many services • users can access the data through many interfaces. • providers of archives offer their data to all interfaces at the same time. This provides for an optimal distribution.
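The "many archives, one database" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the record layout, the archive code "xxx" and the function names are hypothetical, and real archives are read over the network rather than from an in-memory dictionary.

```python
# A minimal sketch of "many archives, one database, many services".
# The record layout and the archive "xxx" are hypothetical.

def merge_archives(archives):
    """Combine per-archive record lists into one dict keyed by handle."""
    database = {}
    for archive_name, records in archives.items():
        for record in records:
            # Keep track of which archive supplied the record.
            database[record["handle"]] = {**record, "archive": archive_name}
    return database


def search(database, word):
    """One possible 'service': a case-insensitive title search."""
    word = word.lower()
    return sorted(h for h, r in database.items() if word in r["title"].lower())


archives = {
    "sur": [{"handle": "RePEc:sur:surrec:9601",
             "title": "Dynamic Aspect of Growth and Fiscal Policy"}],
    "xxx": [{"handle": "RePEc:xxx:wpaper:0001",
             "title": "A hypothetical paper"}],
}

database = merge_archives(archives)
hits = search(database, "growth")
```

Any number of services can be written against `database` without the archives having to know about them.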
based on close to 1000 archives • WoPEc • EconWPA • DEGREE • S-WoPEc • NBER • CEPR • Blackwell • US Fed in Print • IMF • OECD • MIT • University of Surrey • CO PAH • Elsevier
to form a 721k item dataset • 284,000 working papers • 430,000 journal articles • 1,700 software components • 5,200 books and book chapters • 19,000 author contact and publication listings • 11,000 institutional contact listings
RePEc is used in many services • Econpapers • Economists Online • NEP: New Economics Papers • OAI-PMH gateway • RePEc Author Service • IDEAS • RuPEc • EDIRC • LogEc • CitEc • MPRA
… describes documents • Template-Type: ReDIF-Paper 1.0 • Title: Dynamic Aspect of Growth and Fiscal Policy • Author-Name: Thomas Krichel • Author-Person: RePEc:per:1965-06-05:thomas_krichel • Author-Email: T.Krichel@surrey.ac.uk • Author-Name: Paul Levine • Author-Email: P.Levine@surrey.ac.uk • Author-WorkPlace-Name: University of Surrey • Classification-JEL: C61; E21; E23; E62; O41 • File-URL: ftp://www.econ.surrey.ac.uk/pub/RePEc/sur/surrec/surrec9601.pdf • File-Format: application/pdf • Creation-Date: 199603 • Revision-Date: 199711 • Handle: RePEc:sur:surrec:9601
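To show how simple such a template is to process, here is a minimal Python sketch that reads a few fields of the template above into a dictionary. It is not the official ReDIF software and makes simplifying assumptions: one field per line, field names treated as case-insensitive, and no continuation lines. The function name `parse_redif` is my own.

```python
from collections import defaultdict


def parse_redif(text):
    """Parse one ReDIF-style template into {field: [values]} (simplified)."""
    record = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Field: value" line; ignored in this sketch
        record[name.strip().lower()].append(value.strip())
    return dict(record)


sample = """\
Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Name: Paul Levine
Classification-JEL: C61; E21; E23; E62; O41
Handle: RePEc:sur:surrec:9601
"""

record = parse_redif(sample)
authors = record["author-name"]  # repeated fields are kept in order
jel_codes = [code.strip() for code in record["classification-jel"][0].split(";")]
# Reading the handle as archive, series and paper number.
archive, series, number = record["handle"][0].split(":")[1:]
```

Repeated fields such as Author-Name are kept as a list, which is why every value in `record` is a list even when the field occurs once.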
… describes persons (RAS) • template-type: ReDIF-Person 1.0 • name-full: MANKIW, N. GREGORY • name-last: MANKIW • name-first: N. GREGORY • handle: RePEc:per:1984-06-16:N__GREGORY_MANKIW • email: email@example.com • homepage: http://post.economics.harvard.edu/faculty/mankiw/mankiw.html • workplace-institution: RePEc:edi:deharus • workplace-institution: RePEc:edi:nberrus • Author-Article: RePEc:aea:aecrev:v:76:y:1986:i:4:p:676-91 • Author-Article: RePEc:aea:aecrev:v:77:y:1987:i:3:p:358-74 • Author-Article: RePEc:aea:aecrev:v:78:y:1988:i:2:p:173-77 • ….
… describes institutions • Template-Type: ReDIF-Institution 1.0 • Primary-Name: University of Surrey • Primary-Location: Guildford • Secondary-Name: Department of Economics • Secondary-Phone: (01483) 259380 • Secondary-Email: firstname.lastname@example.org • Secondary-Fax: (01483) 259548 • Secondary-Postal: Guildford, Surrey GU2 5XH • Secondary-Homepage: http://www.econ.surrey.ac.uk/ • Handle: RePEc:edi:desuruk
nature of RePEc • RePEc is not a service, it is a library dataset. • The library is freely reusable. • Re-users of RePEc data make the augmented data available. • A positive feedback mechanism is born. • An example is NEP (see Marcus' talk).
business model of RePEc • The business model of RePEc is similar to open source. • In fact RePEc can be thought of as an application of open source coding to library-like metadata. • But the aim of RePEc is not centered on research users.
aim of RePEc • RePEc is focused on the needs of research suppliers rather than research users. • The end use of RePEc data generates evaluative data. • Yes, without end use there is no evaluative data but the end use is only a means to an end. • This is very difficult to understand.
example I • Take some RePEc data • NEP generates classification data. • Authors claim papers in RePEc • Authors say where they work • It is then possible to set up an automated mapping between institutions and subject areas.
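A minimal Python sketch of such a mapping follows. All data is made up for illustration: the paper, author and institution identifiers and the subject codes are hypothetical stand-ins for NEP classifications, author claims and workplace records.

```python
from collections import Counter, defaultdict

# Hypothetical toy data; identifiers and subject codes are made up.
paper_subjects = {"p1": ["mac"], "p2": ["mac", "pbe"], "p3": ["lab"]}  # from NEP
author_papers = {"a1": ["p1", "p2"], "a2": ["p3"]}                     # author claims
author_institutions = {"a1": ["inst_x"], "a2": ["inst_x", "inst_y"]}   # workplaces


def institution_subject_counts(paper_subjects, author_papers, author_institutions):
    """Count, per institution, how often each subject occurs in claimed papers."""
    counts = defaultdict(Counter)
    for author, papers in author_papers.items():
        for institution in author_institutions.get(author, []):
            for paper in papers:
                for subject in paper_subjects.get(paper, []):
                    counts[institution][subject] += 1
    return counts


profile = institution_subject_counts(
    paper_subjects, author_papers, author_institutions
)
top_subject_x = profile["inst_x"].most_common(1)[0][0]
```

In this sketch a paper with two co-authors at the same institution counts twice for that institution. Whether that is desirable is a design choice; counting each paper once per institution is an equally plausible alternative.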
example II • Take some RePEc data • Authors claim papers in RePEc • It is then possible to build a collaboration infrastructure, indicating centrality of authors. • In fact I have software to build a collaboration discovery service. • But I have no server to host it on.
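As an illustration of the idea, and not the software mentioned above, here is a minimal Python sketch that builds a co-authorship graph from claimed papers and computes a simple degree centrality. The identifiers are hypothetical, and degree centrality is just one of several possible centrality measures.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: paper -> authors who have claimed it.
paper_authors = {
    "p1": ["a1", "a2"],
    "p2": ["a2", "a3"],
    "p3": ["a2", "a4"],
    "p4": ["a5"],
}


def coauthor_graph(paper_authors):
    """Build an undirected graph linking authors who share a paper."""
    graph = defaultdict(set)
    for authors in paper_authors.values():
        for author in authors:
            graph[author]  # make sure solo authors appear as isolated nodes
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Share of all other authors that each author has written with."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


graph = coauthor_graph(paper_authors)
centrality = degree_centrality(graph)
most_central = max(centrality, key=centrality.get)
```

A discovery service would go further, for example by finding the shortest co-authorship path between two authors, but the graph built here is the common starting point.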
example III • Take some RePEc data • CitEc has citation data • Authors claim papers • It is then possible to build a ranking of influential papers, not only by direct citation but also by citations to the citing papers, and to rank authors by their citation output.
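One way to make this concrete is the Python sketch below. It is an assumption on my part, not an existing RePEc ranking: a paper scores one point per direct citation plus a discounted point for each citation received by a citing paper, and the discount of 0.5 is arbitrary. All identifiers are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: citing paper -> papers it cites.
citations = {"p2": ["p1"], "p3": ["p1"], "p4": ["p2"], "p5": ["p2"], "p6": ["p3"]}
author_papers = {"a1": ["p1"], "a2": ["p2", "p3"]}  # author claims


def invert(citations):
    """Turn 'citing -> cited' into 'cited -> set of citing papers'."""
    cited_by = defaultdict(set)
    for citing, cited_list in citations.items():
        for cited in cited_list:
            cited_by[cited].add(citing)
    return cited_by


def paper_scores(citations, decay=0.5):
    """Direct citations plus decay times citations to the citing papers."""
    cited_by = invert(citations)
    papers = set(citations) | set(cited_by)
    scores = {}
    for paper in papers:
        direct = cited_by.get(paper, set())
        indirect = sum(len(cited_by.get(citing, set())) for citing in direct)
        scores[paper] = len(direct) + decay * indirect
    return scores


scores = paper_scores(citations)
# An author's score is the sum over the papers they have claimed.
author_scores = {
    author: sum(scores.get(paper, 0.0) for paper in papers)
    for author, papers in author_papers.items()
}
```

The sketch stops at two levels of citation. Going deeper, or iterating until the scores settle, would be a natural extension.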
one key factor emerges • It is most important that authors and institutions are registered in a reusable registration framework. • This is a service to be built for all disciplines. • I am working on this • ariw.org • authorclaim.org
sustainable • Authors and their institutions see that a public record is available of their achievements. • That record is available in many places. • They rush to improve their record. • Most contributors to RePEc don't work as volunteers, but as part of their professional duties.
the coordinators • There are about 10 people who spend quite a bit of their time on RePEc. • They provide crucial functions to the whole. Their contributions go way beyond what would be expected within their professional settings. • There is no formal list of responsibilities and no command and control structure. The RePEc-run mailing list is used to communicate.
hosting • Hosting is a critical issue for RePEc services. • A number of RePEc machines are hosted under very informal agreements. • Moving to acknowledged hosting seems difficult. Libraries, for example, talk the talk but don't walk the walk.
opportunities in preservation • RePEc currently has no preservation strategy. • RePEc could be a model case for digital preservation of born-digital material.
opportunities in peer review • RePEc has made a timid foray into peer review with NEP. • There could be other services that rely on people forming some judgment about papers. • It has been very difficult to get a substantial number of experts on board with a business model for such a service.
peer review through assertions • One thing that could be done with the Knewco technology is to build assertions about relationships in papers. • This process can be helped by a keywords catalog that has been manually gathered within the RePEc data, via the keywords of papers. • People who are registered with the RePEc Author Service could log in to validate claims.
plagiarism detection • There has been plagiarism detection written for arXiv at Cornell, see arXiv eprint cs/0702012. • There is a collection of full-text RePEc papers from the CitEc project. • The plagiarism detection software could be used on it.
the semantic web • Recently a UC Berkeley-led project to build a bibliographic knowledge network has been picking up RePEc data, including CitEc, NEP, and LogEc.
Thank you for your attention! http://openlib.org/home/krichel