
Lotkaian Informetrics and applications to social networks

This article explores the concepts of 1-dimensional and 2-dimensional informetrics, including the growth patterns and size-frequency functions in various fields. It also discusses the laws of Lotka and Zipf, and their application to linguistics and economics. The article further delves into the fractal nature of Lotkaian information production processes (IPPs) and the construction of self-similar fractals. It concludes with the relevance of Lotkaian Informetrics to concentration theory, fractional modeling of authorship, and the dynamics of networks.

rcastro



Presentation Transcript


  1. Lotkaian Informetrics and applications to social networks. L. Egghe, Chief Librarian, Hasselt University; Professor, Antwerp University; Editor-in-Chief, “Journal of Informetrics”. leo.egghe@uhasselt.be

  2. 1-dimensional informetrics • # authors in a field • # journals in a field • # articles in a field • # references (or citations) in a field • # borrowings in a library • # websites, hosts, … • # web citations to a paper • # in- (or out-) links to/from a website • # downloads of an article

  3. Growth: exponential growth. All “new” fields grow exponentially; otherwise there is S-shaped growth.

  4. # web servers versus time

  5. 2-dimensional informetrics • # authors in a field (sources) • # articles in a field (items) • + indicating which author has written which papers. S = set of sources, I = set of items, IPP = Information Production Process

  6. Examples of IPPs

  7. $f$ = size-frequency function: for n = 1,2,3,…, $f(n)$ = # sources with n items • $g$ = rank-frequency function: for r = 1,2,3,…, $g(r)$ = # items in the source on rank r (sources are ranked in decreasing order of number of items they have)
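
As a small illustration of these two functions, the sketch below computes both from a toy mapping of sources to item counts. The names `size_frequency`, `rank_frequency` and the sample data are assumptions made for this example; they do not come from the presentation.

```python
from collections import Counter


def size_frequency(items_per_source):
    """Size-frequency: map n -> number of sources with exactly n items."""
    return dict(Counter(items_per_source.values()))


def rank_frequency(items_per_source):
    """Rank-frequency: map r -> number of items in the source on rank r.

    Sources are ranked in decreasing order of their number of items.
    """
    counts = sorted(items_per_source.values(), reverse=True)
    return {rank: n for rank, n in enumerate(counts, start=1)}


# Hypothetical toy IPP: authors (sources) and their number of papers (items).
papers = {"author_a": 5, "author_b": 1, "author_c": 2, "author_d": 1, "author_e": 1}

f = size_frequency(papers)
g = rank_frequency(papers)
```

Both functions describe the same IPP from two sides: the first counts sources per productivity level, the second lists productivity per ranked source.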

  8. Continuous model Source densities Item densities

  9. Lotkaian Informetrics. The law of Lotka and the law of Zipf. Lotka (1926): $f(n) = C / n^{\alpha}$, with constants $C$ and $\alpha$. The value $\alpha = 2$ is a turning point in informetrics (see further).

  10. Lotka’s law is equivalent with Zipf’s law: $g(r) = B / r^{\beta}$ (linguistics). Zipf’s law in econometrics is called Pareto’s law.

  11. Dependence of G on . Existence of a Groos droop if .

  12. log-log scale = decreasing straight line with slope =
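
A power law of the form n ↦ C / n^α becomes a straight line with slope −α on a log-log scale, so a rough estimate of the exponent can be read off with an ordinary least-squares fit on the logarithms. The sketch below assumes noise-free synthetic data and a hypothetical helper `loglog_fit`; least squares on logs is only a quick diagnostic, not necessarily the best estimator for real data.

```python
import math


def loglog_fit(ns, values):
    """Least-squares line through (log n, log value); returns (slope, intercept)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(v) for v in values]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic size-frequency data following C / n**alpha with C = 1000, alpha = 2.
ns = [1, 2, 3, 4, 5, 6, 7, 8]
values = [1000.0 / n ** 2 for n in ns]

slope, intercept = loglog_fit(ns, values)
alpha_estimate = -slope            # slope of the log-log line is -alpha
c_estimate = math.exp(intercept)   # intercept is log C
```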

  13. Rank-frequency distributions for websites

  14. The scale-free property: $f$ is scale-free ⇔ for every $C > 0$ there exists $D > 0$ such that $f(Cx) = D\,f(x)$ for all $x$.

  15. Theorem (i)⇔(ii): • (i) f is continuous, decreasing and scale-free • (ii) f is a decreasing power function: there exist $E > 0$ and $\alpha > 0$ such that $f(x) = E / x^{\alpha}$, i.e. Lotka’s law
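
The direction from power function to scale-free can be checked numerically. The sketch below takes "scale-free" to mean that rescaling the argument by a factor C only rescales the function by a constant factor D (an assumption about the definition behind the slide). For a power function with exponent α that factor is C^(−α) regardless of x, whereas for an exponential function it changes with x. All names and parameter values are illustrative.

```python
import math


def power_law(x, scale=10.0, alpha=2.0):
    """Decreasing power function scale / x**alpha."""
    return scale / x ** alpha


def exponential(x):
    """Decreasing exponential, used as a non-scale-free counterexample."""
    return math.exp(-x)


def ratio(func, factor, x):
    """func(factor * x) / func(x); constant in x exactly when func is scale-free."""
    return func(factor * x) / func(x)


xs = [1.0, 2.0, 5.0, 10.0]
power_ratios = [ratio(power_law, 3.0, x) for x in xs]   # all equal to 3**-2
exp_ratios = [ratio(exponential, 3.0, x) for x in xs]   # exp(-2x), varies with x
```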

  16. Explanation of Lotka’s law based on exponential growth of sources and items (Naranan (1970)) and an interpretation of Lotkaian IPPs as self-similar fractals (Egghe (2005)) Fractals and fractal dimension

  17. Divide a line piece into 3 equal parts ⇒ we need $3 = 3^1$ line pieces of this length to cover the original line piece. :3 ⇒ need $3 = 3^1$ ⇒ dim = 1

  18. Divide the sides of a square into 3 equal parts ⇒ we need $9 = 3^2$ squares with this side length to cover the original square. :3 ⇒ need $9 = 3^2$ ⇒ dim = 2 • The same for a cube: :3 ⇒ need $27 = 3^3$ ⇒ dim = 3

  19. Construction of the triadic Koch curve

  20. For the triadic Koch curve: :3 ⇒ need $4 = 3^D$ ⇒ dim = D with $D = \ln 4 / \ln 3 \approx 1.262$. The Koch curve is a proper fractal with fractal dimension $D = \ln 4 / \ln 3$. Complexity theory = Fractal theory (Mandelbrot)
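
The covering argument used for the line, the square, the cube and the Koch curve amounts to solving pieces = scale^D for D. A minimal sketch, with the hypothetical helper name `similarity_dimension`:

```python
import math


def similarity_dimension(pieces, scale):
    """Solve pieces = scale**D for D, i.e. D = ln(pieces) / ln(scale)."""
    return math.log(pieces) / math.log(scale)


line_dim = similarity_dimension(3, 3)     # line piece divided by 3: 3 pieces
square_dim = similarity_dimension(9, 3)   # square with sides divided by 3: 9 pieces
cube_dim = similarity_dimension(27, 3)    # cube with sides divided by 3: 27 pieces
koch_dim = similarity_dimension(4, 3)     # triadic Koch curve: 4 pieces
```

Only the Koch curve yields a non-integer value, which is what makes it a proper fractal.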

  21. Naranan (Nature, 1970). Theorem: (i) The number of sources grows exponentially in time t: $c_1 a_1^{t}$ (ii) The number of items in each source grows exponentially in time (iii) The growth rate in (ii) is the same for every source: $a_2$. (ii) and (iii) together imply a fixed exponential function $c_2 a_2^{t}$ for the number of items in each source at time t.

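  22. Then this IPP is Lotkaian, i.e. the law of Lotka applies: if f(p) denotes the number of sources with p items, we have $f(p) = C / p^{\alpha}$, where $\alpha = 1 + \ln a_1 / \ln a_2$, with $a_1$ and $a_2$ the exponential growth rates of the sources and of the items.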
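
The step from the three growth assumptions to Lotka's law can be sketched as follows. This is a heuristic derivation under stated assumptions, not necessarily the argument given in the cited papers: the symbols $c_1, a_1$ (source growth), $c_2, a_2$ (item growth per source), $\tau$ (age of a source) and $F$ (number of sources with at least $p$ items) are notation chosen for this sketch.

```latex
\begin{align*}
N(t) &= c_1 a_1^{t}
  && \text{sources present at time } t \\
p(\tau) &= c_2 a_2^{\tau}
  && \text{items in a source of age } \tau \\
\tau(p) &= \frac{\ln (p / c_2)}{\ln a_2}
  && \text{age needed to reach } p \text{ items} \\
F(p) &= N\bigl(t - \tau(p)\bigr)
      = c_1 a_1^{t} \left(\frac{p}{c_2}\right)^{-\ln a_1 / \ln a_2}
  && \text{sources with at least } p \text{ items} \\
f(p) &= -\frac{dF}{dp} = \frac{C}{p^{\alpha}},
  \qquad \alpha = 1 + \frac{\ln a_1}{\ln a_2}
  && \text{with } C \text{ independent of } p
\end{align*}
```

A source has at least $p$ items exactly when it is old enough, and the number of sufficiently old sources is the number of sources that existed $\tau(p)$ time units ago; differentiating that count gives the power law.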

  23. Egghe (2005) (Book and JASIST) (i) The number of line pieces grows exponentially in time t, here proportional with $4^t$ (ii),(iii) 1/length of each line piece grows exponentially in time t and with the same growth rate 3. Hence we have growth proportional with $3^t$.

  24. Rephrased in terms of informetrics: a (Lotkaian) IPP is a self-similar fractal and its fractal dimension is given by the logarithm of the growth rate of the sources, divided by the logarithm of the growth rate of the items: $D_s = \ln a_1 / \ln a_2$, where $a_1$ and $a_2$ are the growth rates of the sources and of the items (which can be > or < 1). Hence, the exponent in Lotka’s law satisfies the important relation: $\alpha = 1 + D_s$. This result was earlier seen by Mandelbrot but only in the context of (artificial) random texts (hence in linguistics).
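
The link between the growth rates, the fractal dimension and the Lotka exponent can be illustrated with a deterministic toy model. The sketch assumes discrete time steps, sources growing as a1^t and items per source growing as a2^age, and uses the Koch-curve rates 4 and 3 as sample values. The function names are hypothetical, and the exponent is read from the cumulative counts, whose log-log slope is 1 − α.

```python
import math


def cumulative_counts(a1, a2, steps):
    """Pairs (p, number of sources with at least p items) after `steps` time steps."""
    pairs = []
    for age in range(steps + 1):
        items = a2 ** age                  # items held by a source of this age
        n_at_least = a1 ** (steps - age)   # sources that existed `age` steps ago
        pairs.append((items, n_at_least))
    return pairs


def loglog_slope(pairs):
    """Slope of the log-log line through the first and last pair."""
    (p0, n0), (p1, n1) = pairs[0], pairs[-1]
    return (math.log(n1) - math.log(n0)) / (math.log(p1) - math.log(p0))


pairs = cumulative_counts(a1=4.0, a2=3.0, steps=10)
fractal_dimension = -loglog_slope(pairs)   # ln a1 / ln a2
lotka_exponent = 1.0 + fractal_dimension   # exponent of the size-frequency function
```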

  25. Further applications of Lotkaian Informetrics • Concentration theory (inequality theory): Lorenz curves (cf. econometrics). Egghe (2005) (Book, Chapter IV). • Fractional modelling of authorship (case of multi-authored articles): determine # authors with q articles, where q is a fractional score (fractional counting: an author in an m-authored paper receives a score $1/m$).
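
Fractional counting itself is easy to make concrete: each paper distributes a total score of 1 equally over its authors, and the fractional frequency distribution counts how many authors end up with each score. The sketch below uses exact rational arithmetic. The function names and the author lists are made up for illustration.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def fractional_scores(papers):
    """Give each author of an m-authored paper a score of 1/m and sum per author."""
    scores = defaultdict(Fraction)  # Fraction() is 0
    for authors in papers:
        share = Fraction(1, len(authors))
        for author in authors:
            scores[author] += share
    return dict(scores)


def fractional_frequency(scores):
    """Map each fractional score q to the number of authors with that score."""
    return dict(Counter(scores.values()))


# Hypothetical author lists, one list per paper.
papers = [["ann"], ["ann", "bob"], ["bob", "cy"], ["cy", "dee"]]

scores = fractional_scores(papers)
frequency = fractional_frequency(scores)
```

Using `Fraction` rather than floats keeps scores such as 1/2 + 1/2 exactly equal to 1, so authors with the same fractional score are grouped reliably.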

  26. Theoretical and experimental fractional frequency distributions (case of i=4).

  27. Dynamics of Lotkaian IPPs, described via transformations on the sources and on the items: includes the description of dynamics of networks. Relations with 3-dimensional informetrics: See new journal: L. Egghe. General evolutionary theory of IPPs and applications to the evolution of networks. Journal of Informetrics 1(2), 115-122, 2007

  28. Item transformation Source transformation New rank-frequency function

  29. Theorem: New size-frequency function where

  30. Case is example of “linear 3 dimensional informetrics” Sources1 → Items1 = Sources2 → Items2 Examples: • Webpages → hyperlinks → use of hyperlinks • Library subject categories → books → borrowings See further. Back to the general case.

  31. Power law transformations in Lotkaian IPPs

  32. Theorem: is only dependent on b/c due to the scale-free nature of Lotkaian systems.

  33. Corollary: With this, one can study the evolution of an IPP, e.g. a part of WWW: V. Cothey (2007): confirms theory except in one case where non-Lotkaian evolution is found, probably due to “automatic” creation of web pages (deviation from a social network).

  34. Further application: IPPs without low productive sources (Egghe and Rousseau (2006)) Take : sources remain but they grow in number of items: Now

  35. and (since ) Evolution: decreasing Lotka exponent and no low productive sources

  36. Examples • Country sizes: data from www.gazetteer.de (July 10, 2005): 237 countries : = 1.69 (best fit) • Municipalities in Malta (1997 data): 67 municipalities: = 1.12 (best fit) • Database sizes: on the topic “fuzzy set theory” (20 largest databases on this topic) (Hood and Wilson (2003)): = 1.09 (best fit) • Unique documents in databases (20 databases above): =1.33 (best fit).

  37. Application of Lotka’s law to the modelling of the cumulative first-citation distribution i.e. the distribution over time at which an article receives its first citation.

  38. The time t1 at which an article receives its first citation is an important indicator of the visibility of research. • At t1 the article switches its status from “unused” to “used”. • t1 is a measure of immediacy but, of course, different from the immediacy index (Thomson Scientific).

  39. The distribution of t1 over a group of articles is the topic of the present study. We will study the cumulative first-citation distribution = cumulative fraction of all papers that have, at t1, at least 1 citation.

  40. Rousseau (1994) uses two different differential equations to model two types of graphs: a concave one and an S-shaped one. These equations are not explained and are not linked to any informetric distribution.

  41. In Egghe (2000), I use only 2 elementary informetric tools: $c(t)$ = the density function of citations to an article, t time after its publication (exponential, $c(t) = c\,a^{t}$, $0 < a < 1$), $\varphi(A)$ = the density function of the number of papers with A citations in total (Lotka, $\varphi(A) = C / A^{\alpha}$, $\alpha > 1$), (only ever cited papers are used here).

  42. Normalizing to distributions : becomes for an article with A citations in total becomes but we will use the fraction of ever cited articles, in order to include also the never cited articles.

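  43. Theorem: $\Phi(t_1) = \gamma\,(1 - a^{t_1})^{\alpha - 1}$, where $\gamma$ is the fraction of ever cited articles, $a$ the aging rate of the exponential citation density and $\alpha$ the Lotka exponent • concave if $1 < \alpha \le 2$ • S-shaped if $\alpha > 2$, hence explaining both shapes in one model. Note the turning point of $\alpha = 2$.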
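
The two shapes can be checked numerically. The sketch assumes the cumulative first-citation function has the form γ(1 − a^t)^(α − 1), with γ the fraction of ever cited articles, 0 < a < 1 the aging rate and α the Lotka exponent; this form is taken here as an assumption consistent with an exponential citation density and a Lotka distribution of total citations. Parameter values and function names are illustrative.

```python
import math


def first_citation_fraction(t, gamma, a, alpha):
    """Assumed cumulative first-citation function gamma * (1 - a**t)**(alpha - 1)."""
    return gamma * (1.0 - a ** t) ** (alpha - 1.0)


def inflection_time(a, alpha):
    """Time at which the curve turns from convex to concave, or None if it never does."""
    if alpha <= 2.0:
        return None
    # The second derivative vanishes where a**t = 1 / (alpha - 1).
    return -math.log(alpha - 1.0) / math.log(a)


def second_difference(func, t, h=1e-3):
    """Central second difference; its sign approximates the sign of the curvature."""
    return func(t + h) - 2.0 * func(t) + func(t - h)


def concave_case(t):
    return first_citation_fraction(t, gamma=0.8, a=0.5, alpha=1.5)


def s_shaped_case(t):
    return first_citation_fraction(t, gamma=0.8, a=0.5, alpha=3.0)


turning_time = inflection_time(a=0.5, alpha=3.0)   # equals 1 for these values
```

For α ≤ 2 the curvature is negative everywhere, while for α > 2 it is positive before the inflection time and negative after it, which is the S-shape.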

  44. Proof : A first citation is received if (*) ⇒ Cumulative fraction of all articles that are already cited at time t1: (**) ⇒(*) into (**) yields
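
The proof steps can be sketched as a short derivation. The normalisations below are assumptions made for this sketch: citations to an article with $A$ citations in total are spread over time with density $A\,(-\ln a)\,a^{t}$ for $0 < a < 1$, ever cited articles follow the normalised Lotka density $(\alpha - 1)/A^{\alpha}$ on $A \ge 1$, and $\gamma$ is the fraction of ever cited articles.

```latex
\begin{align*}
\int_{0}^{t_1} A\,(-\ln a)\,a^{t}\,dt
  &= A\,\bigl(1 - a^{t_1}\bigr) = 1
  \;\Longrightarrow\;
  A = \frac{1}{1 - a^{t_1}} \tag{$*$} \\
\Phi(t_1)
  &= \gamma \int_{A}^{\infty} \frac{\alpha - 1}{A'^{\alpha}}\,dA'
   = \gamma\,A^{1 - \alpha} \tag{$**$} \\
\Phi(t_1)
  &= \gamma\,\bigl(1 - a^{t_1}\bigr)^{\alpha - 1}
\end{align*}
```

Step $(*)$ gives the total number of citations an article needs in order to have received its first citation by time $t_1$; step $(**)$ counts the fraction of articles with at least that many citations; substituting $(*)$ into $(**)$ gives the last line.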

  45. Motylev (1981)

  46. fit :

  47. Rousseau (1994): JACS to JACS data of Rousseau. Time unit = 2 weeks, 4-year period. Fit:
