Pseudo Random and Random Numbers. Vivek Bhatnagar and Chaitanya Cheruvu. Contents. Introduction to Pseudorandom Numbers Theory Behind Pseudorandom Numbers Some Pseudorandom Number Generators Attacks on Pseudorandom generators Tests for pseudorandom functions
 
                
Pseudo Random and Random Numbers Vivek Bhatnagar and Chaitanya Cheruvu
Contents • Introduction to Pseudorandom Numbers • Theory Behind Pseudorandom Numbers • Some Pseudorandom Number Generators • Attacks on Pseudorandom generators • Tests for pseudorandom functions • True Random generators • Conclusions
Introduction • Truly random - exhibiting ``true'' randomness, such as the time between ``ticks'' of a Geiger counter exposed to a radioactive element. • Pseudorandom - having the appearance of randomness, but nevertheless exhibiting a specific, repeatable pattern. • Numbers calculated by a computer through a deterministic process cannot, by definition, be random.
Introduction • Given knowledge of the algorithm used to create the numbers and its internal state (i.e. seed), you can predict all the numbers returned by subsequent calls to the algorithm, whereas with genuinely random numbers, knowledge of one number or an arbitrarily long sequence of numbers is of no use whatsoever in predicting the next number to be generated. • Computer-generated "random" numbers are more properly referred to as pseudorandom numbers, and pseudorandom sequences of such numbers.
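The predictability described above can be seen with any seeded generator. The following is a minimal Python sketch using the standard library's `random.Random`; the helper name `draw` and the seed values are arbitrary choices for illustration, not part of the original slides.

```python
import random

def draw(seed, count=5):
    """Return `count` pseudorandom floats from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; the seed fixes its internal state
    return [rng.random() for _ in range(count)]

first = draw(seed=12345)
second = draw(seed=12345)   # same algorithm, same seed
other = draw(seed=54321)    # same algorithm, different seed

print(first == second)      # the whole sequence is reproduced
print(first == other)       # a different seed gives a different sequence
```

Anyone who knows the algorithm and the seed can therefore regenerate every value, which is exactly what distinguishes a pseudorandom sequence from a truly random one.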
Introduction • Usage • Almost all network security protocols rely on the randomness of certain parameters • Nonce - used to avoid replay • Session key • Unique parameters in digital signatures • Monte Carlo simulations - a numerical technique that uses repeated random sampling, for example to estimate integrals or solve differential equations. Scenarios are generated at random and statistics are collected over them.
Introduction • (Desirable) Properties of Pseudorandom Numbers • Uncorrelated Sequences - The sequences of random numbers should be serially uncorrelated • Long Period - The generator should be of long period (ideally, the generator should not repeat; practically, the repetition should occur only after the generation of a very large set of random numbers). • Uniformity - The sequence of random numbers should be uniform, and unbiased. That is, equal fractions of random numbers should fall into equal ``areas'' in space. Eg. if random numbers on [0,1) are to be generated, it would be poor practice were more than half to fall into [0, 0.1), presuming the sample size is sufficiently large. • Efficiency - The generator should be efficient. Low overhead for massively parallel computations.
The Random Number Cycle • Almost all random number generators have as their basis a sequence of pseudorandom integers. • The integers or ``fixed point'' numbers are manipulated arithmetically to yield floating point or ``real'' numbers. • The nature of the cycle: • the sequence has a finite number of integers • the sequence gets traversed in a particular order • the sequence repeats if the period of the generator is exceeded • the integers need not be distinct; that is, they may repeat.
Introduction • Testing Pseudorandom generators • Clever algorithms have been developed which generate sequences of numbers that pass every statistical test used to distinguish random sequences from those containing some pattern or internal order. • Tests check the different properties discussed above. • Tests include mean and variance checks. The mean should be close to 0.5 and the variance close to 1/12 ≈ 0.083 for pseudorandom numbers uniformly distributed on [0,1).
Theory • Computational Indistinguishability - Consider an ensemble (i.e. a sequence of numbers). If we can generate a second ensemble that no efficient (polynomial-time) procedure can distinguish from the first, the second ensemble is, for all practical purposes, equivalent to the first. • Ensembles that are computationally indistinguishable from a uniform ensemble are called pseudorandom. • General definition of a pseudorandom generator • A deterministic polynomial-time algorithm G which satisfies the following two conditions: • Expansion: for every s ∈ {0,1}*, |G(s)| > |s| • Pseudorandomness: the ensemble { G(s) }, taken over uniformly chosen seeds s, is pseudorandom
Theory • Significance of Pseudorandom Generators • Efficient amplifiers/expanders of randomness. • Using very little randomness (a randomly chosen seed), they produce very long sequences which look random to any efficient observer. • Pseudorandom generators make it possible to produce high-quality random sequences at low cost, which makes them very useful in cryptography. • They produce unpredictable sequences, i.e. no efficient algorithm can guess the next bit given a prefix of the sequence.
Theory • One-way functions and Pseudorandom numbers • The existence of one is a necessary and sufficient condition for the existence of the other. • Some candidate one-way functions • RSA Function • Discrete Logarithm
Theory • Some practical constructions based on collections of permutations • The intractability of the Discrete Logarithm Problem : based on the fact that it is hard to predict, given a prime P, a primitive element G, and an element Y of the group, whether there exists 0 < x < P/2 such that Y = G^x mod P. • The difficulty of inverting RSA : based on the fact that the least significant bit constitutes a hard-core for the RSA collection. • The intractability of Factoring Blum Integers: based on the fact that the least significant bit constitutes a hard-core for the Rabin collection
Linear Congruential Generators • We begin by discussing the linear congruential generator (LCG), the one most commonly used for generating random integers. • We generate the next random integer X_{n+1} using the previous random integer X_n, the integer constants a and c, and the integer modulus m: X_{n+1} = (a X_n + c) mod m • To get started, the algorithm requires an initial ``seed'' X_0, which must be provided by some means. • We refer to the sequence generated as LCG(a, c, m, X_0). • The appearance of randomness is provided by performing modulo arithmetic or remaindering. • Note that the next result, X_{n+1}, depends only upon the previous integer X_n. This is a characteristic of linear congruential generators which minimizes storage requirements but, at the same time, imposes restrictions on the period.
Linear Congruential Generators • With X_n determined, we generate a corresponding real number as follows: R_n = X_n / m • Because we divide by m, the values R_n are distributed on [0,1). • We desire uniformity, where any particular R_n is just as likely to appear as any other R_n, and the average of the R_n is very close to 0.5.
Linear Congruential Generators • Example 1: LCG(5, 1, 16, 1) • Let us consider a simple example with a = 5, c = 1, m = 16, and X_0 = 1. The sequence of pseudorandom integers generated by this algorithm is: 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, ...
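Example 1 can be reproduced with a few lines of Python. This is a minimal sketch of the recurrence X_{n+1} = (a X_n + c) mod m; the function name `lcg` is an illustrative choice, and the code is not taken from the original slides.

```python
from itertools import islice

def lcg(a, c, m, seed):
    """Yield X_0, X_1, X_2, ... with X_{n+1} = (a * X_n + c) mod m."""
    x = seed
    while True:
        yield x
        x = (a * x + c) % m

# Example 1 from the slides: a = 5, c = 1, m = 16, X_0 = 1
integers = list(islice(lcg(5, 1, 16, 1), 18))
reals = [x / 16 for x in integers]   # R_n = X_n / m, distributed on [0, 1)

print(integers)
print(reals[:4])
```

Only the single integer `x` is carried between steps, which is the small storage requirement noted earlier and also the reason the period can never exceed m.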
Linear Congruential Generators • We observe: • The period (the number of integers before the sequence repeats) P is 16, exactly equal to the modulus m. Thus, for m = 16, this sequence is of long period (the longest possible) and uniform (it completely fills the space of integers from 0 to 15). • The sequence exhibits throughout its period the pattern of alternating odd and even integers, so the lowest-order bit is not random. • It is readily apparent that the sequence is serially correlated. Because of this lack of randomness in the low-order bits, the integer values should not be used directly as random digits. The real numbers generated from the integer sequence are generally sufficiently random in the higher-order (most significant) bits to be used in many application codes.
Linear Congruential Generators • Next, we infer the following. Because each random integer results from the previous integer alone, selecting any initial seed from 0 to 15 would just cyclically shift the above sequence. • Thus, all that a different choice of the initial seed does is shift the starting point in the sequence already determined by a, c and m • Finally, we note that the average of the real numbers is 0.4688 and the variance is 0.0830. The departure of these values from the ideal ones of 1/2 and 1/12 is due to the short period of this sequence and the rather coarse resolution of the generated real numbers. These conditions of average and variance approaching the theoretical values are necessary but not sufficient conditions for a good random number generator.
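The quoted average and variance can be checked directly. The sketch below, which assumes Python 3.8 or later for `statistics.fmean`, walks LCG(5, 1, 16, 1) through one full period and computes the population mean and variance of the real numbers R_n = X_n / m; the helper name is hypothetical.

```python
from statistics import fmean, pvariance

def lcg_reals_one_period(a, c, m, seed):
    """Return R_n = X_n / m for successive X_n until the seed value recurs."""
    reals, x = [], seed
    for _ in range(m):              # an LCG modulo m cannot have a period above m
        reals.append(x / m)
        x = (a * x + c) % m
        if x == seed:
            break
    return reals

reals = lcg_reals_one_period(5, 1, 16, 1)
mean = fmean(reals)
variance = pvariance(reals)         # population variance over the full period

print(f"period={len(reals)} mean={mean:.4f} variance={variance:.4f}")
print(f"ideal mean={0.5:.4f} ideal variance={1 / 12:.4f}")
```

The exact values are 0.46875 and 0.0830078125, matching the figures on the slide to four decimal places. Passing this check says nothing about serial correlation, which is why it is necessary but not sufficient.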
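Linear Congruential Generators • Example 2: LCG(5, 0, 16, 1) • Next, we take the case of c = 0. This is termed a multiplicative congruential random number generator. With a = 5, m = 16 and X_0 = 1, the sequence of pseudorandom integers generated is: 1, 5, 9, 13, 1, 5, 9, 13, ...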
Linear Congruential Generators • Observations • The low-order bits are not random. • The sequence is correlated, as all successive integers differ by 4 (modulo 16) from their predecessors. • At coarse granularity, the sequence is uniform. For example, if we divide [0,1) equally into quarter segments, then exactly one random number falls into each segment: [0, 0.25), [0.25, 0.5), [0.5, 0.75) and [0.75, 1). However, at finer granularity, this uniformity breaks down; consider dividing the domain into 8 equal segments, for example. • There are two separate issues to consider here: • the finite precision of all computers, which limits the precision with which integers can be represented and introduces round-off error in the floating point divide. • the interaction of the sequence of random numbers produced by our generator with our application. This is particularly troublesome when an application requires n-tuples of random numbers, instead of just one random number at a time.
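The coarse versus fine granularity observation for Example 2, LCG(5, 0, 16, 1), can be illustrated by counting how many of the generated real numbers fall into each of 4 and then 8 equal segments of [0,1). This is a small Python sketch with hypothetical helper names.

```python
def lcg_one_period(a, c, m, seed):
    """Return the integers X_n of one period, starting from the seed."""
    values, x = [], seed
    for _ in range(m):              # the period cannot exceed m
        values.append(x)
        x = (a * x + c) % m
        if x == seed:
            break
    return values

def segment_counts(reals, segments):
    """Count how many values of [0, 1) fall into each of `segments` equal parts."""
    counts = [0] * segments
    for r in reals:
        counts[int(r * segments)] += 1
    return counts

integers = lcg_one_period(5, 0, 16, 1)
reals = [x / 16 for x in integers]

print(integers)
print(segment_counts(reals, 4))     # one value per quarter segment
print(segment_counts(reals, 8))     # every second eighth stays empty
```

With only four distinct values in the period, half of the eight finer segments can never be reached, however long the generator runs.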
Linear Congruential Generators • Initial Seed • When debugging, it is important to implement the algorithm to reproduce the same stream of random numbers on successive runs. • the initial seed should be set to a ``random'' odd value Eg.
Linear Congruential Generators • Characteristics of good LCG pseudorandom number generators • A large value of a is desirable to provide sufficient randomness. • A large value of m is also desired, so that the period is kept long. • Summary of the salient features and the recommendations • Multiplicative congruential generators are adequate to good for many applications. They are not acceptable... for high-dimensional work. • They can be very good if speed is a major consideration. Prime moduli are best. However, moduli of the form 2^n are faster on binary computers.
R250 • Uses a shift register sequence. • Has several advantages over a linear congruential generator: • Long period, 2^250 - 1 • Period does not depend upon the number of bits used in the random number generator • Generally much faster than an LCG implementation • The generator is built from a one-bit random generator based on the following equation: a_k = c_1 a_{k-1} + c_2 a_{k-2} + ... + c_p a_{k-p} (mod 2) • The maximum period is 2^p - 1. We will use the value p = 250.
R250 • Choosing most of the c_i terms to be 0, we get the equation: a_k = a_{k-q} + a_{k-p} (mod 2) • If we choose q = 103, each new bit is obtained by adding (mod 2) the bits calculated 103 and 250 steps earlier. • To generate a random number of 16 or 32 bits, the above one-bit addition is performed for each bit in the desired random number. • Since exclusive-or is the same as bitwise addition modulo 2, all the bit operations can be done in parallel. This gives the speed advantage.
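The word-parallel exclusive-or step can be sketched in Python as follows. This is an illustration of the two-lag shift register recurrence with lags 103 and 250, not the reference R250 code: the function name is hypothetical, the 250 starting words are simply drawn from Python's `random.Random` (an assumption), and the careful seed conditioning that a production implementation would perform is omitted.

```python
import random
from collections import deque
from itertools import islice

P, Q = 250, 103   # lags taken from the slides

def shift_register_words(seed, bits=16):
    """Yield words x_k = x_{k-Q} XOR x_{k-P}; every bit position runs the
    one-bit generator in parallel."""
    init = random.Random(seed)       # assumption: any source of P start words
    history = deque((init.getrandbits(bits) for _ in range(P)), maxlen=P)
    while True:
        word = history[-Q] ^ history[-P]   # bitwise addition modulo 2
        history.append(word)               # the oldest word drops out
        yield word

sample = list(islice(shift_register_words(seed=2024), 5))
print(sample)
```

One exclusive-or per output word, with no multiplication or division, is where the speed advantage over an LCG comes from. The period is set by the lag p rather than by the word width.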
Shuffling Numbers • Sometimes it is desirable to randomize a small set of numbers so that a non-repeating sequence is obtained. • Games • Oceanographic RAFOS float • It is important not to repeat numbers. Taking the modulus of the output of a generator like R250 will not work, as the numbers could repeat. • One way to do this is to put the values to be shuffled into an array and to use a random number generator to generate indices into the array, which are used to shuffle the values. The array is then accessed sequentially.
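One common way to realize the array-based approach is the Fisher-Yates shuffle, which is named here as an illustration; the slides do not say which shuffling procedure the authors had in mind. The Python sketch below uses the standard library generator as the source of indices.

```python
import random

def shuffle_in_place(values, rng):
    """Fisher-Yates shuffle: swap each position with a randomly chosen
    position at or before it, working from the end of the list."""
    for i in range(len(values) - 1, 0, -1):
        j = rng.randrange(i + 1)             # random index in 0..i
        values[i], values[j] = values[j], values[i]
    return values

deck = shuffle_in_place(list(range(10)), random.Random(7))
for card in deck:                            # the array is then read sequentially
    print(card, end=" ")
print()
```

Because the values are only swapped, never regenerated, each one appears exactly once in the shuffled order.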
Quasi Random Numbers • For some applications pseudorandom numbers are a little too random. • Some portions of the domain are relatively undersampled and other portions are oversampled. • Quasi-random number generators maintain a uniform density of coverage over the entire domain by giving up serial independence of subsequently generated values.
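As an illustration of the idea, the van der Corput sequence is one well-known quasi-random (low-discrepancy) sequence on [0,1). It is chosen here as an example only; the slides do not name a specific quasi-random generator. A minimal Python sketch:

```python
def van_der_corput(n, base=2):
    """Return the n-th van der Corput point: the base-`base` digits of n
    mirrored about the radix point."""
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        value += digit / denom
    return value

points = [van_der_corput(n) for n in range(1, 9)]
print(points)
```

Each new point falls into the largest remaining gap, so coverage stays even, but successive values are clearly not independent of one another.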
Cryptanalytic Attacks on Random Number Generators • Examples of random parameters in cryptography: • Session keys • Numbers to be hashed with passwords • Parameters in digital signatures • Nonces • Most of the above are approximated using PRNGs • For true randomness: • Noise in electrical circuits • Radioactive decay etc.
Classes of Attacks on PRNGs: • Direct Cryptanalytic Attack: • When the attacker can directly distinguish between PRNG numbers and random numbers (cryptanalyze the PRNG). • Input Based Attack: • When the attacker is able to use knowledge and control of PRNG inputs to cryptanalyze the PRNG. • State Compromise Extension Attacks: • When the attacker can guess some information due to an earlier breach of security. The advantage of a previous attack is extended.
Direct Cryptanalytic Attacks: • When the attacker can directly cryptanalyze the PRNG. • Applicable to most PRNGs • Not applicable when the attacker is not able to directly see the output of the PRNG. • E.g. a PRNG used to generate triple-DES keys. Here the output of the PRNG is never directly seen by an attacker.
Input Based Attacks: • When an attacker uses knowledge or control of the inputs to cryptanalyze the PRNG output. • Types: • Known Input • If the inputs to the PRNG, which are designed to be difficult for a user to guess, turn out to be easily deducible. E.g. disk latency time: when the user is accessing a network disk, the attacker can observe the latency time. • Chosen Input • Practical against smartcards and applications that feed incoming messages (username/password etc.) to the PRNG as entropy samples. • Replayed Input • Similar to chosen input, except it requires less sophistication on the part of the attacker.
State Compromise Extension Attacks: • Attempt to extend the advantages of a temporary security breach • These breaches can be: • Inadvertent leak • Previous cryptographic success • This attack is successful when: • The attacker learns the internal state S of the system and is: • Able to recover unknown PRNG outputs from before S was compromised, OR • Able to recover outputs from after the PRNG has collected a sequence of inputs that the attacker cannot otherwise guess. • These attacks usually succeed when the system is started in a guessable state (due to lack of entropy).
State Compromise Extension Attacks (cont): • These attacks are classified as: • Backtracking attacks • Uses the compromise of PRNG state S to learn about all previous PRNG outputs. • Permanent compromise attack • Once S has been compromised, all future and past outputs of the PRNG are vulnerable. • Iterative guessing attacks • Uses the knowledge of state S that was compromised at time t and the intervening PRNG outputs to guess the state S’ at time t+Δ. • Meet-in-the-middle attacks • Combination of iterative guessing and backtracking.
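A toy example makes backtracking and permanent compromise concrete. The sketch below uses the small LCG(5, 1, 16, X_0) from the earlier slides, whose output is its entire internal state; it is an illustration only, since an LCG is not a cryptographic PRNG and is not one of the generators analyzed in this section. The function names are hypothetical, and the modular inverse via `pow(a, -1, m)` assumes Python 3.8 or later.

```python
def lcg_next(x, a, c, m):
    """Step the generator forward: X_{n+1} = (a * X_n + c) mod m."""
    return (a * x + c) % m

def lcg_prev(x, a, c, m):
    """Step the generator backward; requires a to be invertible modulo m."""
    a_inv = pow(a, -1, m)            # modular inverse of a
    return ((x - c) * a_inv) % m

A, C, M = 5, 1, 16
compromised_state = 13               # state S learned by the attacker

future, x = [], compromised_state
for _ in range(3):                   # permanent compromise: later outputs
    x = lcg_next(x, A, C, M)
    future.append(x)

past, x = [], compromised_state
for _ in range(3):                   # backtracking: earlier outputs
    x = lcg_prev(x, A, C, M)
    past.append(x)

print(future, past)
```

A PRNG designed to resist these attacks has to make the backward step computationally infeasible and mix in fresh entropy so that a captured state does not determine future outputs.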
Some Examples: • ANSI X9.17 PRNG: • Vulnerable to input based attacks and state compromise extension attacks. • DSA PRNG: • Vulnerable only to state compromise extension attacks. • RSAREF PRNG: • Vulnerable to input based attacks and state compromise extension attacks.
Tests for Randomness in Random Numbers: • Quantitative tests: • χ² tests • Lagged Correlation • Qualitative tests: • Scatter Plots • Plot pairs of random numbers. • Clumps of numbers, gaps and patterns are easily visible. • Random Walk
χ² tests: • Measure how well the presumed distribution (usually uniform) is represented. • Algorithm for the test: • Divide the whole interval within which the random numbers fall into a finite number of bins (class intervals). Assume they have the same size. • Count the number of random numbers within each interval (the observed count) and calculate the ``expected'' number of observations [(number of random numbers used) / (number of class intervals) for uniform intervals]. • Calculate: χ² = Σ_{i=1}^{m} (observed_i - expected_i)² / expected_i, where m is the number of class intervals. • The value of χ² determines whether the numbers generated represent the chosen distribution, by comparing it against critical values of χ² looked up in a table.
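The steps of the χ² test translate directly into code. This is a minimal Python sketch for the uniform case on [0,1); the function name, sample size, bin count and seed are illustrative assumptions.

```python
import random

def chi_square_uniform(samples, bins):
    """Return the chi-square statistic of `samples` in [0, 1) against a
    uniform distribution over `bins` equal class intervals."""
    observed = [0] * bins
    for r in samples:
        observed[min(int(r * bins), bins - 1)] += 1
    expected = len(samples) / bins           # same expectation for every bin
    return sum((o - expected) ** 2 / expected for o in observed)

rng = random.Random(42)
samples = [rng.random() for _ in range(10_000)]
statistic = chi_square_uniform(samples, bins=10)
print(f"chi-square with 9 degrees of freedom: {statistic:.2f}")
```

With 10 bins there are 9 degrees of freedom, for which standard tables give a 5% critical value of about 16.92. A statistic well above that suggests the samples are not uniformly distributed.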
Lagged Correlation: • This test reveals the relationship between the numbers at one time and at another (autocorrelation). • Reveals trends and periodicity of numbers. • Properties of an ideal random number generator: • Autocorrelation value = 1; for lag (τ)=0 • Autocorrelation value = 0; for any other value of τ • If the autocorrelation values slowly drop to 0 as τ increases, then the random numbers generated are not very independent of each other.
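A lagged correlation check can be sketched as below. The normalization chosen (lagged covariance divided by the lag-0 sum of squares) is one common convention and is an assumption here, as are the function name and the sample data.

```python
import random

def autocorrelation(values, lag):
    """Return the sample autocorrelation of `values` at the given lag,
    normalized so that lag 0 gives exactly 1."""
    n = len(values)
    mean = sum(values) / n
    total = sum((v - mean) ** 2 for v in values)
    lagged = sum((values[i] - mean) * (values[i + lag] - mean)
                 for i in range(n - lag))
    return lagged / total

rng = random.Random(0)
noise = [rng.random() for _ in range(10_000)]
alternating = [0.0, 1.0] * 50                # strongly patterned sequence

for lag in (0, 1, 2, 5):
    print(lag, round(autocorrelation(noise, lag), 3),
          round(autocorrelation(alternating, lag), 3))
```

For the pseudorandom samples the values at non-zero lags should stay near 0, while the alternating sequence swings between values near -1 and +1, exposing its periodicity.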
Random Walk • Algorithm: • Divide the range of the random number generator into equal intervals. • (Divide into 4 intervals for a random walk in two dimensions.) • Generate a number; if the number falls in the: • First interval, increment X • Second interval, increment Y • Third interval, decrement X • Fourth interval, decrement Y • Generate t steps per walk, for n walks • Calculate the mean squared distance reached • Plot this distance against time • A plot of the mean squared distance against several values of t should be roughly linear. • Otherwise the random numbers are not correctly distributed.
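The random walk test can be sketched in Python as follows. The generator under test is assumed to return floats in [0,1), here Python's `random.Random`; the function name, seed, step counts and number of walks are arbitrary illustrative choices, and the plot is replaced by a printed table.

```python
import random

# first interval: +X, second: +Y, third: -X, fourth: -Y
MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]

def mean_squared_distance(rng, steps, walks):
    """Average squared end-to-end distance over `walks` walks of `steps` steps."""
    total = 0
    for _ in range(walks):
        x = y = 0
        for _ in range(steps):
            dx, dy = MOVES[min(int(rng.random() * 4), 3)]
            x += dx
            y += dy
        total += x * x + y * y
    return total / walks

rng = random.Random(1)
for steps in (25, 50, 100, 200):
    print(steps, mean_squared_distance(rng, steps, walks=1000))
```

For unit steps in independent, equally likely directions, the mean squared distance grows in proportion to the number of steps, so the printed values should rise roughly linearly with t.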
Truly Random Numbers: • Must rely on external physical quantities • Computers require special hardware • Few computers have access to this kind of hardware • Example: Sensors (heat/pressure) etc. • Randomness without relying on external data: • Some way to measure internal activity of the computer such that the activity is quantifiable and genuinely random. • Example: Timing of keystrokes as a user enters a password.
Some physical quantities used in real world for true random number generation: • Timing of keystrokes when a user enters a password. • Measurement of air turbulence due to the movement of hard drive heads. • Timings of memory accesses under artificially induced thrashing conditions. • Precise measurement of current leakage from a CPU or any other system component. • Measurement of timing skew between two systems timers: • A hardware timer • A software timer
Conclusions: • Random numbers are the basis for many cryptographic applications. • There is no reliable ``independent'' function to generate random numbers. • Present-day computers can only approximate random numbers, using pseudorandom numbers generated by Pseudo Random Number Generators (PRNGs). • Many cryptographic applications can be attacked through their PRNGs. • Computer applications are increasingly turning towards physical data (external/internal) to obtain truly random numbers.