Example 1. Read a company's employee data (an R data file) and analyze the characteristics of the salary data.

> fivenum(Edata$SALARY)
[1]  15750  24000  28875  37050 135000
> IQR(Edata$SALARY)
[1]
> summary(Edata$SALARY)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  15750   24000   28880   34420   36940  135000

Example 2. Analyze the characteristics of the salary data by gender, by minority status, and by job category. The JOBCAT levels are 經理 (manager), 保管員 (custodian), and 服務員 (clerk).

> tapply(Edata$SALARY,Edata$GENDER,mean)
f m
> tapply(Edata$SALARY,Edata$JOBCAT,mean)
經理 保管員 服務員
> tapply(Edata$SALARY,Edata$MINORITY,mean)
Yes No
> tapply(Edata$SALARY,Edata$GENDER,fivenum)
$f
[1] 15750 21525 24300 28500 58125
$m
[1]  19650  28050  32850  50550 135000
> tapply(Edata$SALARY,Edata$JOBCAT,fivenum)
$經理
[1]
$保管員
[1] 24300 30150 30750 30975 35250
$服務員
[1] 15750 22800 26550 31200 80000
> tapply(Edata$SALARY,Edata$MINORITY,fivenum)
$Yes
[1]  16350  23625  26625  30675 100000
$No
[1]  15750  24150  29925  40350 135000

Example 3. Analyze the characteristics of the salary data by gender and job category jointly.

> tapply(Edata$SALARY,list(Edata$JOBCAT,Edata$GENDER),mean)
> y <- tapply(Edata$SALARY,list(Edata$JOBCAT,Edata$GENDER),fivenum)
> y
> y[1,1]
> attributes(y)

Homework: For the company employee data, analyze the characteristics of salary and of salary growth (current salary minus starting salary) across gender and ethnicity. Compute the main statistics (mean, IQR, fivenum, range, var, sd) and write an analysis report.

Requirements: Provide the program and the results, save them as a Word document, and send it to: user name: r, password: 123456

tapply                  package:base                  R Documentation

Apply a Function Over a Ragged Array

Description:

     Apply a function to each cell of a ragged array, that is to each
     (non-empty) group of values given by a unique combination of the
     levels of certain factors.

Usage:

     tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)

Arguments:

       X: an atomic object, typically a vector.

   INDEX: list of factors, each of same length as 'X'.

     FUN: the function to be applied. In the case of functions like
          '+', '%*%', etc., the function name must be quoted. If 'FUN'
          is 'NULL', tapply returns a vector which can be used to
          subscript the multi-way array 'tapply' normally produces.
     ...: optional arguments to 'FUN'.

simplify: If 'FALSE', 'tapply' always returns an array of mode 'list'.
          If 'TRUE' (the default), then if 'FUN' always returns a
          scalar, 'tapply' returns an array with the mode of the scalar.

Value:

     When 'FUN' is present, 'tapply' calls 'FUN' for each cell that has
     any data in it. If 'FUN' returns a single atomic value for each
     cell (e.g., functions 'mean' or 'var') and when 'simplify' is
     'TRUE', 'tapply' returns a multi-way array containing the values.
     The array has the same number of dimensions as 'INDEX' has
     components; the number of levels in a dimension is the number of
     levels ('nlevels()') in the corresponding component of 'INDEX'.

     Note that contrary to S, 'simplify = TRUE' always returns an
     array, possibly 1-dimensional.

     If 'FUN' does not return a single atomic value, 'tapply' returns
     an array of mode 'list' whose components are the values of the
     individual calls to 'FUN', i.e., the result is a list with a 'dim'
     attribute.

     Note that optional arguments to 'FUN' supplied by the '...'
     argument are not divided into cells. It is therefore inappropriate
     for 'FUN' to expect additional arguments with the same length as
     'X'.

References:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_. Wadsworth & Brooks/Cole.

See Also:

     the convenience functions 'by' and 'aggregate' (using 'tapply');
     'apply', 'lapply' with its versions 'sapply' and 'mapply'.

Examples:

     require(stats)
     groups <- as.factor(rbinom(32, n = 5, prob = 0.4))
     tapply(groups, groups, length) #- is almost the same as
     table(groups)

     ## contingency table from data.frame : array with named dimnames
     tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
     tapply(warpbreaks$breaks, warpbreaks[, 3, drop = FALSE], sum)

     n <- 17; fac <- factor(rep(1:3, length.out = n), levels = 1:5)
     table(fac)
     tapply(1:n, fac, sum)
     tapply(1:n, fac, sum, simplify = FALSE)
     tapply(1:n, fac, range)
     tapply(1:n, fac, quantile)

     ## example of ... argument: find quarterly means
     tapply(presidents, cycle(presidents), mean, na.rm = TRUE)

     ind <- list(c(1, 2, 2), c("A", "A", "B"))
     table(ind)
     tapply(1:3, ind) #-> the split vector
     tapply(1:3, ind, sum)
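
The difference between a scalar FUN (such as mean) and a non-scalar FUN (such as fivenum), which is what Example 3 probes with y[1,1] and attributes(y), can be tried without the employee file. The following is only a minimal sketch: the data frame toy and all of its values are invented for illustration, and only the column names SALARY, GENDER and JOBCAT are borrowed from the examples.

```r
# Hypothetical toy data: column names mirror Edata, values are invented.
toy <- data.frame(
  SALARY = c(20000, 24000, 30000, 36000, 50000, 60000, 28000, 32000),
  GENDER = factor(c("f", "f", "f", "m", "m", "m", "f", "m")),
  JOBCAT = factor(c("clerk", "clerk", "manager", "clerk",
                    "manager", "manager", "clerk", "clerk"))
)

# One grouping factor, scalar FUN: the result is a named 1-d array.
mean_by_gender <- tapply(toy$SALARY, toy$GENDER, mean)

# Two grouping factors, scalar FUN: the result is a JOBCAT x GENDER matrix.
mean_by_job_gender <- tapply(toy$SALARY, list(toy$JOBCAT, toy$GENDER), mean)

# Non-scalar FUN (fivenum returns five numbers): the result is an array
# of mode "list" with a dim attribute.
y <- tapply(toy$SALARY, list(toy$JOBCAT, toy$GENDER), fivenum)

print(mean_by_gender)
print(mean_by_job_gender)
print(y[1, 1])        # single brackets keep the list wrapper
print(y[[1, 1]])      # double brackets give the numeric vector itself
print(attributes(y))  # dim and dimnames of the list-array
```

Because y has mode "list", y[1,1] is still a one-element list; y[[1,1]] is needed when the five numbers themselves are to be used in further computation.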