Merging Data
There will be times when you need to merge data from multiple sources into a single dataset. R offers several ways to accomplish this, as detailed in the following sections.

Adding Rows from One Dataset to Another
If you have two datasets with the same variables (columns) but different observations (rows), you can use the rbind function to append the observations from one dataset to the other. For data frames, the variable names must match, but the columns do not need to be in the same order.
MergedData = rbind(ObjectNameA, ObjectNameB)
EXAMPLE:
> library(boot)
> MergedData = rbind(aircondit, aircondit7)

Adding Columns from One Dataset to Another (with linking variable)
If you have two datasets whose observations are linked by a linking variable (an ID or key column), you can add variables (columns) from one dataset to the other using the merge function. A common linking variable is a categorical identifier such as Name, SampleID, or Country. To link on more than one variable, pass a character vector of column names to the by argument.
MergedData = merge(DataSetA, DataSetB, by = c("LinkingVariableName1", "LinkingVariableName2", ...))
EXAMPLE:
> MergedData = merge(state.x77, USArrests, by = "row.names")

Adding Columns from One Dataset to Another (without linking variable)
If you have two datasets with the same observations (rows) but different variables (columns) and no common linking variable, you can still merge them. The cbind function adds columns from one dataset directly to another. Because there is no linking variable, the rows of both datasets must be in the same order for the data to be merged correctly.
MergedData = cbind(ObjectNameA, ObjectNameB)
EXAMPLE:
> MergedData = cbind(USArrests, state.area)

Defining New Columns in a Dataset
A new column can be defined in a dataset using the component operator ($). The example below shows how to add a column containing the integers 1 to 50 to an existing dataset. The new column needs one value per row (USArrests has 50 rows).
ExistingObjectName$NewColumnName = c(1:50)
EXAMPLE:
> USArrests$RowNumber = c(1:50)
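The four operations covered in this chapter can be sketched together on a few small data frames. This is only an illustration: the objects a, b, labels, extra, rows, joined, and wide are made up for the sketch and do not come from any package. It shows that rbind matches data frame columns by name, that merge with all = TRUE keeps unmatched keys and fills them with NA, that cbind pairs rows purely by position, and that seq_len(nrow(x)) is one way to avoid hard-coding the row count when adding a column with $.

```r
# Hypothetical toy data frames, invented for this sketch.
a <- data.frame(id = c(1, 2), score = c(10, 20))
b <- data.frame(score = c(30), id = c(3))   # same columns, different order

# rbind on data frames matches columns by name, not by position.
rows <- rbind(a, b)

# merge joins on a key column; all = TRUE keeps keys found in only one
# dataset and fills the missing values with NA.
labels <- data.frame(id = c(1, 3, 4), label = c("x", "y", "z"))
joined <- merge(rows, labels, by = "id", all = TRUE)

# cbind pairs rows purely by position, so both objects are sorted by the
# same key before the columns are combined.
extra <- data.frame(id = c(3, 1, 2), weight = c(0.3, 0.1, 0.2))
extra <- extra[order(extra$id), ]
wide <- cbind(rows[order(rows$id), ], weight = extra$weight)

# $ adds a column; seq_len(nrow(wide)) adapts to the number of rows.
wide$RowNumber <- seq_len(nrow(wide))
```

Without all = TRUE, merge keeps only the keys present in both datasets, which can silently drop observations. Sorting before cbind matters for the same reason the chapter gives: nothing in cbind checks that row i of one object describes the same observation as row i of the other.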