The Variable Explosion, or how the DDI variable spread out to inhabit multiple modules in DDI 3.0
Once there was a mild-mannered DDI 1.0 variable
    <var ID="V11" name="V11" catQnty="2">
      <location StartPos="25" EndPos="25" width="1" RecSegNo="1" fileid="WLT1"/>
      <qstn ID="Q5" seqNo="4">
        <qstnLit>What country were you born in?</qstnLit>
      </qstn>
      <labl level="var">Nativity</labl>
      <catgry ID="CV11_1">
        <catValu>1</catValu>
        <labl level="catgry">Native</labl>
        <catStat type="freq">798920</catStat>
      </catgry>
      <catgry ID="CV11_2">
        <catValu>2</catValu>
        <labl level="catgry">Foreign</labl>
        <catStat type="freq">210023</catStat>
      </catgry>
      <concept source="archive">Place_of_Birth</concept>
      <derivation><drvdesc>If US code as 1, else code as 2</drvdesc></derivation>
    </var>
It was well contained and kept close to home
    <var ID="V11" name="V11" catQnty="2">
      <location StartPos="25" EndPos="25" width="1" RecSegNo="1" fileid="WLT1"/>
      <qstn ID="Q5" seqNo="4">
        <qstnLit>What country were you born in?</qstnLit>
      </qstn>
      <labl level="var">Nativity</labl>
      <catgry ID="CV11_1">
        <catValu>1</catValu>
        <labl level="catgry">Native</labl>
        <catStat type="freq">798920</catStat>
      </catgry>
      <catgry ID="CV11_2">
        <catValu>2</catValu>
        <labl level="catgry">Foreign</labl>
        <catStat type="freq">210023</catStat>
      </catgry>
      <concept source="archive">Place_of_Birth</concept>
      <derivation><drvdesc>If US code as 1, else code as 2</drvdesc></derivation>
    </var>
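Because everything about the variable lives inside one element, a single pass over that element recovers the location, question, label, categories, concept, and derivation. The following is a minimal sketch of that idea using Python's standard `xml.etree.ElementTree`; the function name `summarize_variable` and the shape of the returned dictionary are assumptions made for illustration, not part of DDI.

```python
import xml.etree.ElementTree as ET

# The DDI 1.0 variable from the slide, with straight quotes throughout.
DDI1_VAR = """
<var ID="V11" name="V11" catQnty="2">
  <location StartPos="25" EndPos="25" width="1" RecSegNo="1" fileid="WLT1"/>
  <qstn ID="Q5" seqNo="4">
    <qstnLit>What country were you born in?</qstnLit>
  </qstn>
  <labl level="var">Nativity</labl>
  <catgry ID="CV11_1">
    <catValu>1</catValu>
    <labl level="catgry">Native</labl>
    <catStat type="freq">798920</catStat>
  </catgry>
  <catgry ID="CV11_2">
    <catValu>2</catValu>
    <labl level="catgry">Foreign</labl>
    <catStat type="freq">210023</catStat>
  </catgry>
  <concept source="archive">Place_of_Birth</concept>
  <derivation><drvdesc>If US code as 1, else code as 2</drvdesc></derivation>
</var>
"""


def summarize_variable(xml_text):
    """Collect everything the single <var> element knows about itself.

    Hypothetical helper: the name and the returned dict layout are
    illustrative assumptions, not a DDI-defined structure.
    """
    var = ET.fromstring(xml_text)
    location = var.find("location")
    categories = []
    for cat in var.findall("catgry"):
        categories.append({
            "code": cat.findtext("catValu"),
            "label": cat.findtext("labl"),
            "freq": int(cat.findtext("catStat[@type='freq']")),
        })
    return {
        "name": var.get("name"),
        "label": var.findtext("labl"),  # direct child only, so the variable label
        "question": var.findtext("qstn/qstnLit"),
        "concept": var.findtext("concept"),
        "derivation": var.findtext("derivation/drvdesc"),
        "file_id": location.get("fileid"),
        "start_pos": int(location.get("StartPos")),
        "width": int(location.get("width")),
        "categories": categories,
    }


summary = summarize_variable(DDI1_VAR)
print(summary["label"], "|", summary["question"])
for category in summary["categories"]:
    print(category["code"], category["label"], category["freq"])
```

Note that the concept, the question, the physical location, and the summary statistics all come out of the same element here; that is exactly the containment the later slides break apart.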
By DDI 2.1 it had gotten a bit bolder
• You could arrange the variables into nCubes, creating multidimensional structures
• It now had an optional place to put data location information
• It allowed for more explicit nesting and category structure
Modularity ruled!
• Concepts were captured early
• Questions were developed and linked to concepts
• Categories were defined…
• …and grouped into specific relationships
• Variables were created using these category groups
• Data storage was designed
• Physical instances of data files were created and summary statistics calculated
Fortunately they all hung on to their “handies” and stayed in touch…
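The "staying in touch" can be pictured as identifier-based references between separately maintained pieces. The sketch below is only an illustration of that pattern in Python, not DDI 3.0 syntax: the class names, the `Registry` helper, the concept identifier `C1`, and the second file identifier `WLT1_CSV` are all hypothetical, while `V11`, `Q5`, `WLT1`, and the question text come from the DDI 1.0 example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    id: str
    name: str


@dataclass(frozen=True)
class Question:
    id: str
    text: str
    concept_ref: str  # points at a Concept by identifier


@dataclass(frozen=True)
class Variable:
    id: str
    label: str
    question_ref: str  # points at a Question by identifier


@dataclass(frozen=True)
class PhysicalLocation:
    variable_ref: str  # points at a Variable by identifier
    file_id: str
    start_pos: int
    width: int


class Registry:
    """Hypothetical lookup table that resolves identifier references."""

    def __init__(self):
        self._items = {}

    def add(self, item):
        if item.id in self._items:
            raise ValueError(f"duplicate identifier: {item.id}")
        self._items[item.id] = item
        return item

    def resolve(self, ref):
        return self._items[ref]


registry = Registry()

# Pieces are registered in the order the information becomes available.
concept = registry.add(Concept("C1", "Place_of_Birth"))
question = registry.add(Question("Q5", "What country were you born in?", "C1"))
variable = registry.add(Variable("V11", "Nativity", "Q5"))

# Physical storage is described separately and only refers back to the variable.
locations = [PhysicalLocation("V11", "WLT1", 25, 1)]

# A new physical instance can be added without changing the variable.
locations.append(PhysicalLocation("V11", "WLT1_CSV", 3, 1))

# Following the references walks from storage back to the concept.
linked_variable = registry.resolve(locations[0].variable_ref)
linked_question = registry.resolve(linked_variable.question_ref)
linked_concept = registry.resolve(linked_question.concept_ref)
print(linked_variable.label, "->", linked_question.text, "->", linked_concept.name)
```

The design choice worth noticing is that each piece stores only identifiers of the things it depends on, so a piece can be reused or replaced without rewriting its neighbours.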
Implications for creating documentation
• You can create documentation in the order that information becomes available
• You can reuse pieces and imply relationships
• You can support community-wide concept, question, category, and variable banks
• You can create new physical instances or formats without changing other modules
It requires
• Thinking about a variable as a development process rather than as an artifact
• New tools to facilitate information capture and create the appropriate links
Categories
• Individual labels and definitions
• Comparability is determined by definition, NOT by label
• Can be used by multiple Category Groups
Category Groups
• A category group is made up of 1 or more categories
• Flat: no hierarchies
• Hierarchical: levels
    • Regular
    • Irregular
• Provides specific category codes
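The split between categories and category groups can be sketched as follows. This is an illustrative Python model under stated assumptions, not the DDI 3.0 schema: the class and function names are hypothetical, the definitions are invented from the derivation note in the DDI 1.0 example ("If US code as 1, else code as 2"), the letter codes `N` and `F` are made up, and treating "comparable" as "same definition text" is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    id: str
    label: str
    definition: str


def comparable(a, b):
    """Simplified rule: categories compare by definition, never by label."""
    return a.definition == b.definition


class CategoryGroup:
    """A flat group that assigns its own codes to shared categories."""

    def __init__(self, name):
        self.name = name
        self._by_code = {}

    def assign(self, code, category):
        if code in self._by_code:
            raise ValueError(f"code {code!r} already used in {self.name}")
        self._by_code[code] = category

    def label_for(self, code):
        return self._by_code[code].label

    def codes(self):
        return list(self._by_code)


# Definitions below are assumptions made for this example.
native = Category("CAT_NATIVE", "Native", "Born in the United States")
foreign = Category("CAT_FOREIGN", "Foreign", "Born outside the United States")

# Same definition, different label: still comparable.
us_born = Category("CAT_US_BORN", "US-born", "Born in the United States")
# Same label, different definition: not comparable.
native_state = Category("CAT_NATIVE_STATE", "Native", "Born in the state of residence")

# One category can sit in several groups, each supplying its own codes.
numeric_group = CategoryGroup("Nativity (numeric codes)")
numeric_group.assign("1", native)
numeric_group.assign("2", foreign)

letter_group = CategoryGroup("Nativity (letter codes)")
letter_group.assign("N", native)
letter_group.assign("F", foreign)

print(comparable(native, us_born), comparable(native, native_state))
print(numeric_group.label_for("1"), letter_group.label_for("N"))
```

Only flat groups are modelled here; regular and irregular hierarchies would need a level or parent relationship on top of this.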
Variable construction
• Variable
• No categories
• Ranges
• Open
• Category Groups
• Full
• Level
• Discrete
• Range
• Cherry pick
NCube
• They grew up and got a capital N
• Still composed of one or more variables
• Provide for multiple measures
• Gained features to make them comparable to SDMX structures
What have we got?
• Logical product
• Categories
• Category Groups [name may change to protect the innocent]
• Assembled into variables
• Variables assembled into NCubes
• Concepts, questions, physical location, and summary stats are in other modules