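Notes on "Learning R": A Scientific Calculator; Inspecting Variables and Your Workspace; Vectors, Matrices, and Arrays; Lists and Data Frames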

1. Chapter 2: A Scientific Calculator

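To check whether two numbers are the same, use all.equal() rather than ==. all.equal() compares within a small tolerance, whereas == tests for exact equality and is only suitable for comparing two integers; floating-point rounding makes it unreliable for other numbers.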

> all.equal(sqrt(2)^2,2)

[1] TRUE

> all.equal(sqrt(2) ^ 2,3)

[1] "Mean relative difference: 0.5"

> isTRUE(all.equal(sqrt(2) ^ 2,2))

[1] TRUE

> isTRUE(all.equal(sqrt(2) ^ 2,3))

[1] FALSE
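To make the difference concrete, here is a minimal sketch that applies both comparisons to the same value. The variable names are illustrative only, and the tolerance of 0 is used just to show what an exact check looks like.

```r
# sqrt(2)^2 is not exactly 2 in floating-point arithmetic
x <- sqrt(2)^2

exact  <- x == 2                     # exact comparison: FALSE
approx <- isTRUE(all.equal(x, 2))    # comparison within a tolerance: TRUE

# The gap is tiny but not zero
x - 2

# all.equal() takes a tolerance argument; a tolerance of 0 demands an exact match
strict <- isTRUE(all.equal(x, 2, tolerance = 0))   # FALSE
```

Wrapping all.equal() in isTRUE() matters because on a mismatch all.equal() returns a descriptive string rather than FALSE.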

2. Chapter 3: Inspecting Variables and Your Workspace

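Variable classes: logical; three numeric classes (numeric, complex, integer); character for storing text; factor for storing categorical data; and the less common raw for storing binary data.

A factor stores categorical data: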

> gender = factor(c("male","female","male","female"))

> gender

[1] male female male female

Levels: female male

> levels(gender)

[1] "female" "male"

> nlevels(gender)

[1] 2

Under the hood, factor values are stored as integers rather than characters. Calling as.integer() shows this clearly:

> as.integer(gender)

[1] 2 1 2 1
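The link between the integer codes and the levels can be sketched as follows. This assumes the same gender factor as above; gender2 and the other variable names are hypothetical and only illustrate how the level order determines the codes.

```r
gender <- factor(c("male", "female", "male", "female"))

codes <- as.integer(gender)   # 2 1 2 1
labs  <- levels(gender)       # "female" "male" (sorted alphabetically by default)

# Indexing the levels by the codes recovers the original labels
recovered <- labs[codes]
identical(recovered, as.character(gender))   # TRUE

# The levels argument sets the order explicitly, which changes the codes
gender2 <- factor(c("male", "female", "male", "female"),
                  levels = c("male", "female"))
as.integer(gender2)           # 1 2 1 2
```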

Storing integers instead of character text turns out to make memory use more efficient:

> gender_char = sample(c("female","male"),1000,replace = TRUE)

> gender_char

......

> gender_fac = as.factor(gender_char)

> # convert the data to a factor

> object.size(gender_char) # object.size() returns the memory size of an object

8160 bytes

> object.size(gender_fac)

4560 bytes

Converting a factor to character strings:

> as.character(gender)

[1] "male" "female" "male" "female"

Changing the type of an object (casting):

> x = "123.456" # use an as.* function to change the type of x

> as.numeric(x) # equivalent to as(x, "numeric")

[1] 123.456

> is.numeric(x)

[1] FALSE
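is.numeric(x) is FALSE here because as.numeric() returns a new value and leaves x untouched. The following sketch spells that out and adds two common edge cases; the names y and z are illustrative.

```r
x <- "123.456"

y <- as.numeric(x)    # returns a new numeric value; x itself is unchanged
is.numeric(x)         # FALSE: x is still a character string
is.numeric(y)         # TRUE
class(x)              # "character"

# Text that cannot be parsed as a number becomes NA (with a warning)
z <- suppressWarnings(as.numeric("abc"))
is.na(z)              # TRUE

# as.integer() truncates toward zero rather than rounding
as.integer(3.9)       # 3
```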

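options(digits = n) sets a global option that controls how many significant digits are used when printing numbers (significant digits, not decimal places).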

> options(digits = 10)

> (x = runif(5))

[1] 0.040052175522 0.544388080016 0.506369658280

[4] 0.144690239336 0.005838404642
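As a small sketch of what digits means in practice, format() accepts the same digits argument, which makes the effect easy to see without changing the global option. The last lines show one way to change the option temporarily and then restore it; the name old is illustrative.

```r
# digits counts significant digits, not decimal places
format(pi, digits = 3)          # "3.14"
format(123.456, digits = 3)     # "123"
format(0.00012345, digits = 3)  # "0.000123"

# Change the global option temporarily and restore it afterwards
old <- options(digits = 4)
print(pi)                       # 3.142
options(old)
```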

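Here runif generates 30 random numbers uniformly distributed between 0 and 1. The summary function provides summary information suited to each data type, for example for a numeric variable: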

> num = runif(30)

> summary(num)

Min. 1st Qu. Median Mean

0.001235794 0.199856233 0.475356185 0.475318138

3rd Qu. Max.

0.703412558 0.984893506

letters and LETTERS are two built-in constants:

> letters

[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l"

[13] "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x"

[25] "y" "z"

> LETTERS

[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L"

[13] "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X"

[25] "Y" "Z"

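sample is the sampling function. Its form is sample(x, size = , replace = ); the third argument defaults to FALSE, which means sampling without replacement.

Drawing 30 random samples with replacement from a to e: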

> fac = factor(sample(letters[1:5],size = 30,replace = T))

> summary(fac)

a b c d e

4 7 2 5 12
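The effect of the replace argument can be sketched as below. The seed value is arbitrary and only makes the run repeatable; perm, res, and draws are illustrative names.

```r
set.seed(42)  # arbitrary seed so the run is reproducible

# Without replacement (the default) each element is drawn at most once,
# so sample(x) is simply a random permutation of x
perm <- sample(letters[1:5])
sort(perm)                       # "a" "b" "c" "d" "e"

# Asking for more draws than elements fails without replacement
res <- tryCatch(sample(letters[1:5], size = 30),
                error = function(e) "error")
res                              # "error"

# With replacement, any number of draws is allowed
draws <- sample(letters[1:5], size = 30, replace = TRUE)
length(draws)                    # 30
```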

> bool = sample(c(TRUE,FALSE,NA),30,replace = TRUE)

> summary(bool)

Mode FALSE TRUE NA's

logical 10 8 12

Create a data frame dfr; only its first few rows are shown here:

> dfr = data.frame(num,fac,bool)

> head(dfr) # shows the first 6 rows by default

num fac bool

1 0.34019507235 b NA

2 0.77415443189 e TRUE

3 0.02201034524 d TRUE

4 0.11190012516 e NA

5 0.18030911358 a NA

6 0.98489350639 d TRUE

> summary(dfr)

num fac bool

Min. :0.001235794 a: 4 Mode :logical

1st Qu.:0.199856233 b: 7 FALSE:10

Median :0.475356185 c: 2 TRUE :8

Mean :0.475318138 d: 5 NA's :12

3rd Qu.:0.703412558 e:12

Max. :0.984893506

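The str function shows the structure of an object. It is not very interesting for vectors (they are too simple), but str is very useful for data frames and nested lists: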

> str(num)

num [1:30] 0.34 0.774 0.022 0.112 0.18 ...

> str(dfr)

'data.frame': 30 obs. of 3 variables:

$ num : num 0.34 0.774 0.022 0.112 0.18 ...

$ fac : Factor w/ 5 levels "a","b","c","d",..: 2 5 4 5 1 4 1 4 1 5 ...

$ bool: logi NA TRUE TRUE NA NA TRUE ...
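Since nested lists are mentioned but not shown, here is a minimal sketch of str on one. The list nested and its contents are made up purely for illustration.

```r
# A hypothetical nested list, used only for illustration
nested <- list(
  id = 1L,
  tags = c("x", "y"),
  inner = list(flag = TRUE, values = c(0.5, 1.5))
)

# str() prints one line per component, indenting nested components
str(nested)

# max.level limits how deep str() descends
out <- capture.output(str(nested, max.level = 1))
length(out)   # one header line plus one line per top-level component
```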

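Each class has its own print method, which controls how it is displayed in the console. Sometimes this printing obscures the internal structure or omits useful information. The unclass function gets around this and shows how a variable is built. For example, calling unclass on a factor reveals that it is just an integer vector with an attribute called levels. (The fac in the output below is a four-element factor with the levels cat, dog, goldfish, and hamster, not the 30-element factor created earlier.)

> unclass(fac)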

[1] 2 1 4 3

attr(,"levels")

[1] "cat" "dog" "goldfish" "hamster"

The attributes function shows a list of all the attributes of an object:

> attributes(fac)

$levels

[1] "cat" "dog" "goldfish" "hamster"

$class

[1] "factor"
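The output above can be reproduced with a sketch like the following. The element order dog, cat, hamster, goldfish is an assumption inferred from the codes 2 1 4 3 and the four levels shown, and the name pets is hypothetical.

```r
# Assumed input, inferred from the codes and levels printed above
pets <- factor(c("dog", "cat", "hamster", "goldfish"))

class(pets)              # "factor"
u <- unclass(pets)       # integer codes plus a "levels" attribute
class(u)                 # "integer"
as.vector(u)             # 2 1 4 3
attr(u, "levels")        # "cat" "dog" "goldfish" "hamster"

# attributes() on the factor itself also reports the class
names(attributes(pets))  # "levels" "class"
```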

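The View function (note the capital V) displays a data frame as a spreadsheet. edit and fix are similar, but they let you change data values by hand.

View(dfr) # read-only; changes are not allowed

new_dfr = edit(dfr) # changes are saved in new_dfr

fix(dfr) # changes are saved in dfr

View(head(dfr,50)) # view the first 50 rows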

3. Chapter 4: Vectors, Matrices, and Arrays

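Arrays hold multidimensional rectangular data. A matrix is the special case of a two-dimensional array.

There are many ways to create a sequence; the advantage of seq is that you can set the step size.

> (xulie = seq(1,15,2))

[1] 1 3 5 7 9 11 13 15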

The length() function returns the length of a sequence:

> length(xulie)

[1] 8
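A few other ways to build sequences, sketched for comparison. These are all base R functions; the specific numbers are arbitrary.

```r
seq(1, 15, by = 2)           # 1 3 5 7 9 11 13 15
seq(0, 1, length.out = 5)    # 0.00 0.25 0.50 0.75 1.00
seq(10, 1, by = -3)          # 10 7 4 1

# seq_len() and seq_along() are safe for empty input, unlike 1:n
seq_len(0)                   # integer(0)
1:0                          # 1 0
seq_along(c("a", "b", "c"))  # 1 2 3
```

The 1:0 case is the usual reason to prefer seq_len(n) when n might be zero, for instance in a loop over a possibly empty vector.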

Naming the elements of a vector:

> c(apple = 1,banana = 2,"kiwi fruit" = 3, 4)

apple banana kiwi fruit

1 2 3 4

> x = 1:4

> names(x) = c("apple" ,"banana" ,"kiwi fruit","")

> x

apple banana kiwi fruit

1 2 3 4
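Once a vector has names, they can be used for indexing. A minimal sketch, reusing the same named vector:

```r
x <- c(apple = 1, banana = 2, "kiwi fruit" = 3, 4)

names(x)            # "apple" "banana" "kiwi fruit" ""
x["banana"]         # the element named banana, with its name kept
x[["kiwi fruit"]]   # 3, with the name dropped

# A vector with no names at all reports NULL
names(1:4)          # NULL
```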

Creating arrays:

> three_d_array = array( # a three-dimensional array

+ 1:24,

+ dim = c(4,3,2),

+ dimnames = list(

+ c("one","two","three","four"),

+ c("ein","zwei","drei"),

+ c("un","deux")

+ )

+ )

> three_d_array

, , un

ein zwei drei

one 1 5 9

two 2 6 10

three 3 7 11

four 4 8 12

, , deux

ein zwei drei

one 13 17 21

two 14 18 22

three 15 19 23

four 16 20 24

> (a_matrix = matrix( # create a matrix

+ 1:12,

+ nrow = 4,byrow = T,

+ dimnames = list(

+ c("one","two","three","four"),

+ c("ein","zwei","drei")

+ )

+ ))

ein zwei drei

one 1 2 3

two 4 5 6

three 7 8 9

four 10 11 12
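The byrow argument and array indexing can be sketched as follows. The dimnames are omitted to keep the example short, and by_col, by_row, arr, and three_d are illustrative names.

```r
by_col <- matrix(1:12, nrow = 4)                 # default: fill column by column
by_row <- matrix(1:12, nrow = 4, byrow = TRUE)   # fill row by row

by_col[1, ]     # 1 5 9
by_row[1, ]     # 1 2 3

# An array with a 4 x 3 dim is the same thing as a matrix
arr <- array(1:12, dim = c(4, 3))
identical(arr, by_col)      # TRUE

# Index a three-dimensional array with one subscript per dimension
three_d <- array(1:24, dim = c(4, 3, 2))
three_d[2, 3, 1]            # 10
three_d[2, 3, 2]            # 22
```

The two indexed values match the two/drei entries of the un and deux slices printed earlier.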

Indexing and some useful functions:

> x = (1:5) ^ 2

> x

[1] 1 4 9 16 25

> x[c(1,3,5)]

[1] 1 9 25

> x[c(-2,-4)]

[1] 1 9 25

> x[c(TRUE,F,T,F,T)]

[1] 1 9 25

> names(x) = c("one","four","nine","sixteen","twenty five")

> x

one four nine sixteen twenty five

1 4 9 16 25

> which(x > 10)

sixteen twenty five

4 5

> which.min(x)

one

1

> which.max(x)

twenty five

5

> rep(1:5 , 3)

[1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5

> rep(1:5 , each = 3)

[1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5

> rep(1:5 , times = 1:5)

[1] 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5

> rep(1:5 , length.out = 7)

[1] 1 2 3 4 5 1 2

> rep.int(1:5 , 3)

[1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5

> rep_len(1:5 , 13)

[1] 1 2 3 4 5 1 2 3 4 5 1 2 3

> dim(three_d_array)

[1] 4 3 2

> dim(a_matrix)

[1] 4 3

> nrow(a_matrix)

[1] 4

> ncol(a_matrix)

[1] 3
