Filtering Data
There will be times when a user needs to filter data before generating visualizations or performing statistical analyses. This can be accomplished with the subset function. A few examples are provided in the following sections.
Filtering a dataset based on a variable meeting a defined condition
Filtering a dataset based on the value of a single variable (column) can be achieved with the subset function. The new filtered dataset includes only the rows for which the filter expression evaluates to TRUE.
FilteredData = subset(UnfilteredObjectName, Filter Expression)
EXAMPLE:
> FilteredData = subset(DNase, density > 1.5)
or
> FilteredData = subset(CO2, Type == "Mississippi")
Filtering a dataset based on variables meeting multiple defined conditions
Multiple filter expressions can be combined in a single call to subset by joining them with the & (AND) operator. This allows a user to filter a dataset based on multiple variables (columns). The new filtered dataset includes only the rows for which all of the filter expressions evaluate to TRUE.
FilteredData = subset(UnfilteredObjectName, Filter Expression 1 & Filter Expression 2 & Filter Expression 3 ...)
EXAMPLE:
> FilteredData = subset(DNase, density > 1.5 & conc == 6.25)
or
> FilteredData = subset(CO2, Type == "Mississippi" & Treatment == "chilled")
Filtering a dataset based on variables meeting one of multiple defined conditions
Multiple filter expressions can also be combined in a single call to subset by joining them with the | (OR) operator. This allows a user to filter a dataset based on multiple variables (columns). The new filtered dataset includes only the rows for which at least one of the filter expressions evaluates to TRUE.
FilteredData = subset(UnfilteredObjectName, Filter Expression 1 | Filter Expression 2 | Filter Expression 3 ...)
EXAMPLE:
> FilteredData = subset(DNase, density > 1.5 | conc == 6.25)
or
> FilteredData = subset(CO2, Type == "Mississippi" | Treatment == "chilled")
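As a rough illustration of how the single-condition, & and | filters relate to one another, the sketch below applies each to the built-in CO2 dataset and compares row counts. The object names (single, chilled, both, either) are chosen here for illustration only, and <- is used for assignment, which behaves the same as = in these statements.

```r
# CO2 ships with base R in the 'datasets' package, so no import is needed.

# One condition on one column
single  <- subset(CO2, Type == "Mississippi")
chilled <- subset(CO2, Treatment == "chilled")

# Both conditions must hold (AND)
both <- subset(CO2, Type == "Mississippi" & Treatment == "chilled")

# At least one condition must hold (OR)
either <- subset(CO2, Type == "Mississippi" | Treatment == "chilled")

# Every row kept by the AND filter satisfies both conditions
stopifnot(all(both$Type == "Mississippi" & both$Treatment == "chilled"))

# Rows matching either condition = rows matching each, minus the overlap
stopifnot(nrow(either) == nrow(single) + nrow(chilled) - nrow(both))

# Compare how many rows each filter keeps
print(c(all = nrow(CO2), single = nrow(single),
        both = nrow(both), either = nrow(either)))
```

Because every row that satisfies both conditions also satisfies each one individually, the & result can never have more rows than a single-condition result, and the | result can never have fewer.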