Understanding Data Types in Programming
Explore various data types in programming including integers, floats, characters, arrays, strings, and more. Learn allocation strategies, operations, errors, and complexities of different types.
Primitives and aggregates • Integer • Float • Character • Boolean • Pointers • Strings • Records • Enumerated • Arrays • Objects
Strings
• Fixed length: B e t s y b b b (padded with blanks, shown as b)
• Null terminated: B e t s y 0
• Length field: 5 B e t s y
• Heap allocated: a pointer to B e t s y stored in the heap
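String Allocation
• Static length
  • blank fill, as in Fortran, Pascal, etc.
• Limited dynamic length
  • grows up to a limit
• Dynamic length
  • no length restriction
  • reallocates from the heap
Limited dynamic length (C):
  char x[4];
  strcpy(x, "abc");   // OK: 3 characters plus the terminating null fit in 4 bytes
  strcat(x, "def");   // NO: writes past the end of x
Dynamic length (languages with heap-allocated strings):
  x = "abc";
  x = "abcdefghij";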
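The dynamic-length case can be imitated in C by reallocating from the heap whenever the string grows. The following is a minimal sketch, not part of the original slides; the helper name str_append is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap string holding s followed by suffix.
   s must be NULL or a pointer previously returned by malloc/realloc.
   On allocation failure it returns NULL and leaves s untouched. */
char *str_append(char *s, const char *suffix)
{
    size_t old_len = s ? strlen(s) : 0;
    size_t add_len = strlen(suffix);
    char *grown = realloc(s, old_len + add_len + 1);  /* +1 for the null terminator */
    if (grown == NULL)
        return NULL;
    memcpy(grown + old_len, suffix, add_len + 1);     /* copies the terminator too */
    return grown;
}
```

Starting from NULL, str_append(NULL, "abc") followed by str_append(s, "defghij") yields "abcdefghij"; the caller releases the result with free.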
Implementation
• Strings can be viewed as a primitive type
  • some machine languages support string operations, so strings can be treated as primitives even though the operations are slower
• Sometimes requires both
  • compile-time descriptors
  • run-time descriptors
  • know the difference
Enumerated types
• Usually implemented as integers
• Implied size limitation, which is not a problem in practice
• (red, green, blue): red is 0, green is 1, etc.
• Strong typing sometimes creates ambiguity
  • we want the types to be distinguished, but consider
  • weekday = (Mon, Tue, Wed, Thur, Fri);
  • classday = (Mon, Wed, Fri);
  • assignment is OK in one direction, but not the other
• I/O of enumeration values is allowed in some languages, not in others
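As an illustration of the integer implementation, here is a small C sketch. The type and the helper color_name are hypothetical; C is one of the languages without built-in I/O for enumeration names, so the lookup is written by hand.

```c
enum color { RED, GREEN, BLUE };   /* RED is 0, GREEN is 1, BLUE is 2 */

/* Hypothetical helper: maps an enumeration value to a printable name. */
const char *color_name(enum color c)
{
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    return "unknown";              /* value outside the declared constants */
}
```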
Subrange
• A contiguous subsequence of an ordinal type
  • Mon..Fri
• Used for tighter restriction of values than primitive types provide
  • subtype age is integer range 0..150;
• Sometimes compatible with the parent type, sometimes not
  • EXAMPLE: is age compatible with integer?
  • type age is new integer range 1..150; NO (a new, distinct type)
  • subtype age is integer range 1..150; YES (a constrained view of integer)
Array Operations
• Operations on whole arrays are infrequent, except in APL
• Examples
  • elements (common)
  • entire array (as parameters/pointers)
  • slice (a row, column, or series of rows/columns)
• APL
  • matrix multiplication
  • vector dot product
  • add a scalar to each element
Allocation strategies
• Static array
• Fixed stack-dynamic
  • int x[20]; the size is decided at compile time
• Stack dynamic
  • int x[n]; the size is determined by n at allocation and can't change afterwards
• Heap dynamic
  • the array can grow dynamically and change its subscript range
  • ever been frustrated by the MAX size of an array?
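The heap-dynamic strategy can be sketched in C with realloc. This is an illustrative sketch only; the type int_vec, the function vec_push, and the doubling policy are assumptions, not something the slides prescribe.

```c
#include <stdlib.h>

/* Hypothetical growable array of int (heap-dynamic storage). */
struct int_vec {
    int   *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Appends value, doubling the capacity when the array is full.
   Returns 0 on success, -1 if the reallocation fails. */
int vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *grown = realloc(v->data, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        v->data = grown;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}
```

A vector initialised to { NULL, 0, 0 } has no MAX size beyond available memory; the caller frees v.data when done.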
Subscript/subrange errors
• Subscript bounds errors in arrays are among our biggest programming nuisances
• Checking for them at run time is expensive
• Even a subscript within the range is not guaranteed to be the correct one
• Some languages, such as C, do NO checking
• The consequences in a program can be difficult or impossible to trace
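Since C does no checking, a program that wants it has to pay for the test explicitly. A minimal sketch, with checked_get as a hypothetical helper name:

```c
#include <stddef.h>

/* Hypothetical checked read for an array of n ints.
   Stores a[i] in *out and returns 1 when i is in range;
   otherwise returns 0 and leaves *out unchanged. */
int checked_get(const int *a, size_t n, size_t i, int *out)
{
    if (i >= n)          /* the run-time cost: one comparison per access */
        return 0;
    *out = a[i];
    return 1;
}
```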
Addressing
• Storage is in row-major or column-major order
• Example: int A[3,2]; (3 rows, 2 columns)
  Logical layout:
    (1,1) (1,2)
    (2,1) (2,2)
    (3,1) (3,2)
  Row-major order:    (1,1) (1,2) (2,1) (2,2) (3,1) (3,2)
  Column-major order: (1,1) (2,1) (3,1) (1,2) (2,2) (3,2)
Determining location
Location(a[I]) = base address(a) + (I - lower bound) * element size
Example: integer a[6]; assume elements are 4 bytes each, starting at address 100
  [1] 100
  [2] 104
  [3] 108
  [4] 112
  [5] 116
  [6] 120
Loc(a[3]) = 100 + (3 - 1) * 4 = 108
Most of this is computed at compile time!
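The location formula translates directly into code. In this sketch addresses are plain integers for illustration, and the function name element_address is hypothetical.

```c
/* Address of a[i] in a one-dimensional array:
   base + (i - lower_bound) * element_size */
long element_address(long base, long lower_bound, long element_size, long i)
{
    return base + (i - lower_bound) * element_size;
}
```

With the slide's values, element_address(100, 1, 4, 3) gives 108.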
2-d arrays (column major)
Loc(a[I,J]) = base address(a) + (I - lb1) * element size + (J - lb2) * column size
column size = number of rows allocated * element size
Example: 3 rows, 2 columns, 4-byte elements, starting at address 100
  (1,1) 100
  (2,1) 104
  (3,1) 108
  (1,2) 112
  (2,2) 116
  (3,2) 120
Loc(a[1,2]) = 100 + (1 - 1) * 4 + (2 - 1) * 3 * 4 = 100 + 0 + 12 = 112
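The column-major calculation can be sketched the same way. Addresses are plain integers for illustration, and column_major_address is a hypothetical name.

```c
/* Address of a[i,j] in a column-major array with `rows` rows:
   base + (i - lb1) * element_size + (j - lb2) * column_size */
long column_major_address(long base, long lb1, long lb2,
                          long rows, long element_size,
                          long i, long j)
{
    long column_size = rows * element_size;   /* bytes in one whole column */
    return base + (i - lb1) * element_size + (j - lb2) * column_size;
}
```

With the slide's values, column_major_address(100, 1, 1, 3, 4, 1, 2) gives 112.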
Passing 2-d arrays as parameters
• The receiving procedure needs to have the DIMENSION information
• Some languages are tightly bound and force this; Pascal does so by requiring the parameter to be a declared type
• Others have strange rules
  • Fortran (column major): only the number of rows is needed to locate an element, so the last dimension can be a dummy
Called:
  SUBROUTINE PROCESS(A,N)
  INTEGER A(N,1)
Caller:
  INTEGER A(10,20)
  CALL PROCESS(A,10)
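C has the mirror-image rule: storage is row major, so the receiving function must know the number of columns, while the number of rows can be passed separately. A minimal sketch with a hypothetical function sum_2d and an assumed width of 3 columns:

```c
/* The column count (3) is part of the parameter type;
   the row count is an ordinary argument. */
int sum_2d(int a[][3], int rows)
{
    int total = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < 3; j++)
            total += a[i][j];
    return total;
}
```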
Associative arrays
• Not common; built into Perl
• Uses a hash function
• Stores key and value
Example hash %salaries:
  mary    55750
  cedric  75000
  gary    47850
  perry   57000
In math-class notation: hash(key) = value, or hash("gary") = 47850
$salaries{"gary"} -> 47850
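For a language without built-in associative arrays, the same idea can be sketched by hand. The following C sketch is illustrative only: the names, the fixed table size, the djb2-style hash, and linear probing are all assumptions, and Perl's own implementation is different. Keys are stored by pointer, not copied, and there is no deletion.

```c
#include <string.h>

#define TABLE_SIZE 16            /* assumed capacity; this sketch never resizes */

struct entry { const char *key; int value; };
struct table { struct entry slots[TABLE_SIZE]; };   /* zero-initialise before use */

/* djb2-style string hash, reduced to a slot index. */
static unsigned hash_key(const char *key)
{
    unsigned h = 5381;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h % TABLE_SIZE;
}

/* Inserts or updates key. Returns 0 on success, -1 if the table is full. */
int table_put(struct table *t, const char *key, int value)
{
    unsigned start = hash_key(key);
    for (unsigned n = 0; n < TABLE_SIZE; n++) {
        struct entry *e = &t->slots[(start + n) % TABLE_SIZE];  /* linear probing */
        if (e->key == NULL || strcmp(e->key, key) == 0) {
            e->key = key;
            e->value = value;
            return 0;
        }
    }
    return -1;
}

/* Returns a pointer to the stored value, or NULL if key is absent. */
const int *table_get(const struct table *t, const char *key)
{
    unsigned start = hash_key(key);
    for (unsigned n = 0; n < TABLE_SIZE; n++) {
        const struct entry *e = &t->slots[(start + n) % TABLE_SIZE];
        if (e->key == NULL)
            return NULL;         /* an empty slot ends the probe sequence */
        if (strcmp(e->key, key) == 0)
            return &e->value;
    }
    return NULL;
}
```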
Arrays as pointers in C
• Using an array name in C is (in most expressions) the same as using a pointer to its first element
• Incrementing such a pointer advances it by the size in bytes of the element it points to
  • assume integers are 4 bytes
  • int *j;
  • j++; // increments j by 4, assuming byte-addressable memory
Example code in C
Pointer version:
  int c[10], *j;
  // start j at the address of c[0]; continue as long as the address in j
  // is within the bounds of c; j++ advances j by the size of an integer
  for (j = c; j < &c[10]; j++) {
      *j = 0;   // set the element to 0
  }
Equivalent subscript version:
  for (int j = 0; j < 10; j++) {
      c[j] = 0;
  }
Records
• Record operations
  • assignment
  • comparison
  • block operations without respect to fields
• Strange syntax in C
• Unions
Record pointers in C
  struct person {
      int weight;
      int age;
      char name[20];
  };
  struct person teacher;
In the declaring routine:
  teacher.age = 35;
When the record is passed by pointer to a function, inside that function:
  teacher->age = 35;
Unions
• Free unions
  • two names for the same place
  • it's up to you to keep them straight
  • no support for checking
• Discriminated unions
  • a value in the record indicates how to interpret the associated data
  • checking is not always easy, and sometimes it is not done
Ada example (p.231)
Common fields: color, filled
Discriminant (form) selects the variant:
  rectangle: side1, side2
  circle: diameter
  triangle: leftside, rightside, angle
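C unions are free unions, so a discriminated union has to be built by hand: the programmer stores the discriminant and checks it before touching the data. A minimal sketch modelled on the figure record above; the C names, the field types, and the helper get_diameter are assumptions.

```c
enum form { RECTANGLE, CIRCLE, TRIANGLE };

struct figure {
    enum form form;              /* the discriminant */
    int color;                   /* common fields (types assumed) */
    int filled;
    union {                      /* only the member matching `form` is valid */
        struct { double side1, side2; } rectangle;
        struct { double diameter; } circle;
        struct { double leftside, rightside, angle; } triangle;
    } u;
};

/* Checked access: succeeds only when the discriminant says the
   union currently holds a circle. Returns 1 on success, 0 otherwise. */
int get_diameter(const struct figure *f, double *out)
{
    if (f->form != CIRCLE)
        return 0;
    *out = f->u.circle.diameter;
    return 1;
}
```

Nothing in C stops code from reading f->u.circle.diameter without the check, which is exactly the "up to you to keep them straight" problem.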
Sets
• Implemented as bit fields, one bit per possible element (example below)
  • fast implementation
  • set operations are easy binary operations
  • try set union
  • the limit on set size is related to the width of the binary operations
  type colors = (red, blue, green, yellow, orange, white, black);
       colorset = set of colors;
  var set1 : colorset;
  set1 := [red, orange, blue];
  implemented as ( 1 1 0 0 1 0 0 )
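The bit-field implementation can be sketched in C, where union and intersection become single bitwise operations. The names below are hypothetical. Note that the slide writes the bits left to right starting with red, while in this sketch red is bit 0, the least significant bit.

```c
/* Same order as the Pascal declaration: red, blue, green, ... */
enum color { RED, BLUE, GREEN, YELLOW, ORANGE, WHITE, BLACK };

typedef unsigned colorset;       /* one bit per color; bit 0 is RED */

colorset set_of(enum color c)                     { return 1u << c; }
colorset set_union(colorset a, colorset b)        { return a | b; }   /* bitwise OR */
colorset set_intersection(colorset a, colorset b) { return a & b; }   /* bitwise AND */
int      set_contains(colorset s, enum color c)   { return (int)((s >> c) & 1u); }
```

The set [red, orange, blue] is then set_union(set_of(RED), set_union(set_of(ORANGE), set_of(BLUE))), which has bits 0, 1 and 4 set.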
Pointers
• Lots of flexibility
• Data comes from the heap
• Difficult to manage what you are pointing at
• Many languages strongly manage the types to which pointers point
  • C doesn't care
  • C++ does
• The real problems come from how programmers manage pointers
Pointer problems
Dangling reference:
  int *p1, *p2;
  p1 = new int;
  p2 = p1;
  delete p1;    // p2 still points at the deallocated storage
Lost heap-dynamic variable:
  int *p1, *p2;
  p1 = new int;
  p1 = p2;      // the int allocated above is lost: nothing points to it
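Handling Pointer Problems
• Tombstones
  • every heap cell is reached through an extra cell, the tombstone
  • the tombstone always stays, even after the memory is deallocated
  • a variable never points directly at deallocated data
Before deallocation: pointer -> tombstone -> cell
After deallocation:  pointer -> tombstone (set to null)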
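A tombstone can be imitated in C with one extra level of indirection. This is a minimal sketch for int cells only; the names are hypothetical, and in a real language implementation the compiler and run-time system would insert this indirection automatically.

```c
#include <stdlib.h>

struct tombstone { int *cell; };   /* cell becomes NULL once it is deallocated */

/* Allocates a cell and its tombstone. Returns NULL on allocation failure. */
struct tombstone *ts_new_int(int value)
{
    struct tombstone *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->cell = malloc(sizeof *t->cell);
    if (t->cell == NULL) {
        free(t);
        return NULL;
    }
    *t->cell = value;
    return t;
}

/* Frees the cell but keeps the tombstone, so every copy of the
   pointer sees NULL instead of dangling. */
void ts_delete(struct tombstone *t)
{
    free(t->cell);
    t->cell = NULL;
}

/* Returns the live cell, or NULL if it has been deallocated. */
int *ts_deref(const struct tombstone *t)
{
    return t->cell;
}
```

The costs are visible in the sketch: every access goes through ts_deref, and the tombstone itself is never reclaimed by ts_delete.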
Handling Pointer Problems
• Reference counters
  • 3 pointers at the same cell: the count is 3
  • 2 pointers at the same cell: the count is 2
  • delete the cell when the reference count reaches 0
  • other than efficiency, the tricky case is circular lists (cells in a cycle never reach a count of 0)
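Reference counting can be sketched in C with a counter stored in the cell. The names are hypothetical, and the caller is responsible for calling rc_retain and rc_release at every pointer copy and drop, which a language implementation would do automatically.

```c
#include <stdlib.h>

struct rc_cell {
    int count;    /* number of pointers currently referring to this cell */
    int value;
};

/* Creates a cell with one reference. Returns NULL on allocation failure. */
struct rc_cell *rc_new(int value)
{
    struct rc_cell *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->count = 1;
    c->value = value;
    return c;
}

/* Call when a pointer to the cell is copied. */
void rc_retain(struct rc_cell *c)
{
    c->count++;
}

/* Call when a pointer to the cell goes away.
   Returns 1 if this was the last reference and the cell was freed. */
int rc_release(struct rc_cell *c)
{
    if (--c->count == 0) {
        free(c);
        return 1;
    }
    return 0;
}
```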
Handling Pointer Problems
• Garbage collection
  • initial scenario: cells carry leftover marks
  • mark all cells with 0
  • mark all cells that are pointed at with 1
  • cells still marked 0 are garbage and can be reclaimed
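The marking steps can be sketched in C over a small array of cells. The cell layout (two outgoing pointers), the root list, and the function names are assumptions made for illustration; the recursive marking is the simplest form and would need an explicit stack or pointer reversal for deep structures.

```c
#include <stddef.h>

struct cell {
    int mark;
    struct cell *left, *right;   /* outgoing pointers; NULL if unused */
};

/* Marks c and everything reachable from it with 1. */
static void mark_from(struct cell *c)
{
    if (c == NULL || c->mark)
        return;                  /* already marked: this also stops on cycles */
    c->mark = 1;
    mark_from(c->left);
    mark_from(c->right);
}

/* Mark phase: mark all cells with 0, then mark every cell reachable
   from the roots with 1. Returns the number of cells left at 0 (garbage). */
size_t mark_phase(struct cell *heap, size_t n, struct cell **roots, size_t nroots)
{
    size_t garbage = 0;
    for (size_t i = 0; i < n; i++)
        heap[i].mark = 0;
    for (size_t i = 0; i < nroots; i++)
        mark_from(roots[i]);
    for (size_t i = 0; i < n; i++)
        if (!heap[i].mark)
            garbage++;
    return garbage;
}
```

Unlike reference counting, this finds an unreachable circular list: its cells stay marked 0 because no root leads to them.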