170 likes | 553 Vues
Combining Data on a Common Variable. Merging: combining records from two data sets based on a common variable.General form of the data step:data SAS-data-set;merge SAS-data-sets;by variable(s);SAS-statementsrun;Data sets must be sorted.. Example. Merge monthly data on actual revenue (per
                
                E N D
1. Merging SAS Data Sets STT 305 
2. Combining Data on a Common Variable Merging:  combining records from two data sets based on a common variable.
General form of the data step:
data SAS-data-set;
	merge SAS-data-sets;
	by variable(s);
	SAS-statements
run;
Data sets must be sorted. 
3. Example Merge monthly data on actual revenue (performance) to data on goals (goals).
The code:
data compare;
	merge stt305.performance stt305.goals;
	by month;
run; 
4. Compilation data compare;
	merge stt305.performance stt305.goals;
	by month;
run;
performance				goals 
5. Execution data compare;
	merge stt305.performance stt305.goals;
	by month;
run;
performance				goals 
6. Execution data compare;
	merge stt305.performance stt305.goals;
	by month;
run;
performance				goals 
7. Non-Matches Merge the following data sets by employee ID number:
	Crewlist				Sched 
8. Execution data schedule;
	merge stt305.crewlist 	 	stt305.sched;
	by empID;
run; 
9. Execution data schedule;
	merge stt305.crewlist 	 	stt305.sched;
	by empID;
run; 
10. Execution data schedule;
	merge stt305.crewlist 	 	stt305.sched;
	by empID;
run; 
11. Skipping Non-Matches Suppose I only wanted to report crew members who were scheduled in previous example.
One possibilityIN= option
SAS-data-set(IN=variable)
Variable is logical (technically, numeric)
0 indicates no contribution from that data set 
1 indicates a contribution from that data set
 
12. Modified Code data schedule;
	merge crew schedule(in=InSched);
	by empID;
	if InSched=1;
run;
Only outputs observations for which the by variable was present in the schedule data. 
13. One-to-Many Merges We wish to merge data sets containing information on accounts and transactions on those accounts.
accounts			transactions 
14. data records;
	merge accounts transactions;
	by account_no;
run; 
15. data records;
	merge accounts transactions;
	by account_no;
run; 
16. data records;
	merge accounts transactions;
	by account_no;
run; 
17. One-to-Many Merges By this logic, how will the rest of the merge proceed?
Does the order of the listing of the data sets in the merge statement make a difference? 
18. Many-to-Many Merge What would happen if I tried to merge these two data sets?
     salesstaff			clients