Problem description
I am trying out a sentiment analysis on a dataset. Here, I am trying to see whether there are any interesting relationships between message volume and buzz, message volume and scores...
Here is what my dataset looks like:
> str(data)
'data.frame': 40 obs. of 11 variables:
$ Date Time : POSIXct, format: "2015-07-08 09:10:00" "2015-07-08 09:10:00" ...
$ Subject : chr "MMM" "ACE" "AES" "AFL" ...
$ Sscore : chr "-0.2280" "-0.4415" "1.9821" "-2.9335" ...
$ Smean : chr "0.2593" "0.3521" "0.0233" "0.0035" ...
$ Svscore : chr "-0.2795" "-0.0374" "1.1743" "-0.2975" ...
$ Sdispersion : chr "0.375" "0.500" "1.000" "1.000" ...
$ Svolume : num 8 4 1 1 5 3 2 1 1 2 ...
$ Sbuzz : chr "0.6026" "0.7200" "1.9445" "0.8321" ...
$ Last close : chr "155.430000000" "104.460000000" "13.200000000" "61.960000000" ...
$ Company name: chr "3M Company" "ACE Limited" "The AES Corporation" "AFLAC Inc." ...
$ Date : Date, format: "2015-07-08" "2015-07-08" ...
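The str() output above lists Sscore, Smean, Svscore, Sdispersion, Sbuzz and Last close as chr. ggplot2 treats character columns as discrete, which may be part of why geom_smooth later complains about one unique x value per group. A minimal sketch of converting such columns to numeric before melting, assuming a small stand-in data frame (demo, a hypothetical name) built from the first four rows of the sample data:

```r
# Hypothetical stand-in for the real data: a few rows with the
# numeric-looking columns stored as character, as in the str() output.
demo <- data.frame(
  Subject = c("MMM", "ACE", "AES", "AFL"),
  Sscore  = c("-0.2280", "-0.4415", "1.9821", "-2.9335"),
  Sbuzz   = c("0.6026", "0.7200", "1.9445", "0.8321"),
  Svolume = c(8, 4, 1, 1),
  stringsAsFactors = FALSE
)

# Columns that should be numeric (names taken from the str() output).
num_cols <- c("Sscore", "Sbuzz")

# as.numeric() parses the strings; anything unparseable becomes NA with a warning.
demo[num_cols] <- lapply(demo[num_cols], as.numeric)

str(demo)
```

On the real data, num_cols would presumably also include Smean, Svscore, Sdispersion and Last close.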
I thought about a linear regression, so I wanted to use ggplot. With the code below, though, I think I went wrong somewhere, because the regression lines do not appear... Is it because the regression is too weak? I based my code on: code of topchef
Mine is:
library(ggplot2)
require(ggplot2)
library("reshape2")
require(reshape2)
data.2 = melt(data[3:9], id.vars='Svolume')
ggplot(data.2) +
geom_jitter(aes(value,Svolume, colour=variable),) + geom_smooth(aes(value,Svolume, colour=variable), method=lm, se=FALSE) +
facet_wrap(~variable, scales="free_x") +
labs(x = "Variables", y = "Svolumes")
But I have probably misunderstood something, as I don't get what I want. I am very new to R, so I would love some help.
I get this message:
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
Finally, would it be possible to have a different colour for each Subject instead of one colour per variable? And can I add the regression line to every graph?
Thank you for your help.
Sample data:
Date Time Subject Sscore Smean Svscore Sdispersion Svolume Sbuzz Last close Company name Date
1 2015-07-08 09:10:00 MMM -0.2280 0.2593 -0.2795 0.375 8 0.6026 155.430000000 3M Company 2015-07-08
2 2015-07-08 09:10:00 ACE -0.4415 0.3521 -0.0374 0.500 4 0.7200 104.460000000 ACE Limited 2015-07-08
3 2015-07-07 09:10:00 AES 1.9821 0.0233 1.1743 1.000 1 1.9445 13.200000000 The AES Corporation 2015-07-07
4 2015-07-04 09:10:00 AFL -2.9335 0.0035 -0.2975 1.000 1 0.8321 61.960000000 AFLAC Inc. 2015-07-04
5 2015-07-07 09:10:00 MMM 0.2977 0.2713 -0.7436 0.400 5 0.4895 155.080000000 3M Company 2015-07-07
6 2015-07-07 09:10:00 ACE -0.2331 0.3519 -0.1118 1.000 3 0.7196 103.330000000 ACE Limited 2015-07-07
7 2015-06-28 09:10:00 AES 1.8721 0.0609 1.9100 0.500 2 2.4319 13.460000000 The AES Corporation 2015-06-28
8 2015-07-03 09:10:00 AFL 0.6024 0.0330 -0.2663 1.000 1 0.6822 61.960000000 AFLAC Inc. 2015-07-03
9 2015-07-06 09:10:00 MMM -1.0057 0.2579 -1.3796 1.000 1 0.4531 155.380000000 3M Company 2015-07-06
10 2015-07-06 09:10:00 ACE -0.0263 0.3435 -0.1904 1.000 2 1.3536 103.740000000 ACE Limited 2015-07-06
11 2015-06-19 09:10:00 AES -1.1981 0.1517 1.2063 1.000 2 1.9427 13.850000000 The AES Corporation 2015-06-19
12 2015-07-02 09:10:00 AFL -0.8247 0.0269 1.8635 1.000 5 2.2454 62.430000000 AFLAC Inc. 2015-07-02
13 2015-07-05 09:10:00 MMM -0.4272 0.3107 -0.7970 0.167 6 0.6003 155.380000000 3M Company 2015-07-05
14 2015-07-04 09:10:00 ACE 0.0642 0.3274 -0.0975 0.667 3 1.2932 103.740000000 ACE Limited 2015-07-04
15 2015-06-17 09:10:00 AES 0.1627 0.1839 1.3141 0.500 2 1.9578 13.580000000 The AES Corporation 2015-06-17
16 2015-07-01 09:10:00 AFL -0.7419 0.0316 1.5699 0.250 4 2.0988 62.200000000 AFLAC Inc. 2015-07-01
17 2015-07-04 09:10:00 MMM -0.5962 0.3484 -1.2481 0.667 3 0.4496 155.380000000 3M Company 2015-07-04
18 2015-07-03 09:10:00 ACE 0.8527 0.3085 0.1944 0.833 6 1.3656 103.740000000 ACE Limited 2015-07-03
19 2015-06-15 09:10:00 AES 0.8145 0.1725 0.2939 1.000 1 1.6121 13.350000000 The AES Corporation 2015-06-15
20 2015-06-30 09:10:00 AFL 0.3076 0.0538 -0.0938 1.000 1 0.7071 61.440000000 AFLAC Inc. 2015-06-30
dput
data <- structure(list(`Date Time` = structure(c(1436361000, 1436361000,
1436274600, 1436015400, 1436274600, 1436274600, 1435497000, 1435929000,
1436188200, 1436188200, 1434719400, 1435842600, 1436101800, 1436015400,
1434546600, 1435756200, 1436015400, 1435929000, 1434373800, 1435669800
), class = c("POSIXct", "POSIXt"), tzone = ""), Subject = c("MMM",
"ACE", "AES", "AFL", "MMM", "ACE", "AES", "AFL", "MMM", "ACE",
"AES", "AFL", "MMM", "ACE", "AES", "AFL", "MMM", "ACE", "AES",
"AFL"), Sscore = c(-0.228, -0.4415, 1.9821, -2.9335, 0.2977,
-0.2331, 1.8721, 0.6024, -1.0057, -0.0263, -1.1981, -0.8247,
-0.4272, 0.0642, 0.1627, -0.7419, -0.5962, 0.8527, 0.8145, 0.3076
), Smean = c(0.2593, 0.3521, 0.0233, 0.0035, 0.2713, 0.3519,
0.0609, 0.033, 0.2579, 0.3435, 0.1517, 0.0269, 0.3107, 0.3274,
0.1839, 0.0316, 0.3484, 0.3085, 0.1725, 0.0538), Svscore = c(-0.2795,
-0.0374, 1.1743, -0.2975, -0.7436, -0.1118, 1.91, -0.2663, -1.3796,
-0.1904, 1.2063, 1.8635, -0.797, -0.0975, 1.3141, 1.5699, -1.2481,
0.1944, 0.2939, -0.0938), Sdispersion = c(0.375, 0.5, 1, 1, 0.4,
1, 0.5, 1, 1, 1, 1, 1, 0.167, 0.667, 0.5, 0.25, 0.667, 0.833,
1, 1), Svolume = c(8L, 4L, 1L, 1L, 5L, 3L, 2L, 1L, 1L, 2L, 2L,
5L, 6L, 3L, 2L, 4L, 3L, 6L, 1L, 1L), Sbuzz = c(0.6026, 0.72,
1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822, 0.4531, 1.3536,
1.9427, 2.2454, 0.6003, 1.2932, 1.9578, 2.0988, 0.4496, 1.3656,
1.6121, 0.7071), `Last close` = c(155.43, 104.46, 13.2, 61.96,
155.08, 103.33, 13.46, 61.96, 155.38, 103.74, 13.85, 62.43, 155.38,
103.74, 13.58, 62.2, 155.38, 103.74, 13.35, 61.44), `Company name` = c("3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc."), Date = structure(c(16624,
16624, 16623, 16620, 16623, 16623, 16614, 16619, 16622, 16622,
16605, 16618, 16621, 16620, 16603, 16617, 16620, 16619, 16601,
16616), class = "Date")), .Names = c("Date Time", "Subject",
"Sscore", "Smean", "Svscore", "Sdispersion", "Svolume", "Sbuzz",
"Last close", "Company name", "Date"), row.names = c("1", "2",
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
"15", "16", "17", "18", "19", "20"), class = "data.frame")
Note the warning: Maybe you want aes(group = 1)? All I've done is add group = 1 to aes for geom_smooth.
ggplot(data.2) +
geom_jitter(aes(value,Svolume, colour=variable),) +
geom_smooth(aes(value,Svolume, colour=variable, group = 1), method=lm, se=FALSE) +
facet_wrap(~variable, scales="free_x") +
labs(x = "Variables", y = "Svolumes")
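On the question of whether a regression is "too weak" to show: geom_smooth draws whatever line lm fits, however weak, so the strength of each fit is easier to judge from the numbers. A rough sketch in base R that fits Svolume against each predictor separately, mirroring one line per facet. The demo data frame is a hypothetical stand-in typed in from the first six rows of the sample data, and only two predictors are included for brevity:

```r
# First six rows of the sample data, typed in by hand for illustration.
demo <- data.frame(
  Svolume = c(8, 4, 1, 1, 5, 3),
  Sscore  = c(-0.2280, -0.4415, 1.9821, -2.9335, 0.2977, -0.2331),
  Sbuzz   = c(0.6026, 0.7200, 1.9445, 0.8321, 0.4895, 0.7196)
)

# Fit Svolume ~ x separately for each predictor (one model per facet).
predictors <- setdiff(names(demo), "Svolume")
fits <- lapply(predictors, function(v) {
  lm(reformulate(v, response = "Svolume"), data = demo)
})
names(fits) <- predictors

# Collect the slope and R-squared of each fit in one table.
summary_table <- data.frame(
  variable  = predictors,
  slope     = sapply(fits, function(f) unname(coef(f)[2])),
  r_squared = sapply(fits, function(f) summary(f)$r.squared),
  row.names = NULL
)
print(summary_table)
```

A low r_squared would mean a weak relationship, but the line would still be drawn in the plot.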
Some unsolicited advice
1. You don't need to use both require and library; use one or the other.
2. You only need aes once.
3. Your example data didn't work - I had to fiddle with it to read it. See How to make a great R reproducible example? for advice.
Here's how I would write the ggplot code:
library(ggplot2)
require(reshape2)
data.2 = melt(data[3:9], id.vars='Svolume')
ggplot(data.2) +
aes(x = value, y = Svolume, colour = variable) +
geom_jitter() +
geom_smooth(method=lm, se=FALSE, aes(group = 1)) +
facet_wrap(~variable, scales="free_x") +
labs(x = "Variables", y = "Svolumes")
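The question also asked about colouring by Subject rather than by variable. One possible approach, sketched under assumptions and not part of the original answer: keep Subject as an id column in melt, map colour to Subject only in geom_jitter, and leave geom_smooth ungrouped so each facet still gets a single regression line. The demo data frame is a hypothetical stand-in typed in from the first eight rows of the sample data:

```r
library(ggplot2)
library(reshape2)

# A few rows from the sample data, typed in by hand for illustration.
demo <- data.frame(
  Subject = c("MMM", "ACE", "AES", "AFL", "MMM", "ACE", "AES", "AFL"),
  Sscore  = c(-0.2280, -0.4415, 1.9821, -2.9335, 0.2977, -0.2331, 1.8721, 0.6024),
  Sbuzz   = c(0.6026, 0.7200, 1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822),
  Svolume = c(8, 4, 1, 1, 5, 3, 2, 1)
)

# Keep Subject as an id column so it survives the melt.
demo_long <- melt(demo, id.vars = c("Svolume", "Subject"))

p <- ggplot(demo_long, aes(x = value, y = Svolume)) +
  geom_jitter(aes(colour = Subject)) +                       # points coloured per Subject
  geom_smooth(method = "lm", se = FALSE, colour = "black") + # one line per facet
  facet_wrap(~variable, scales = "free_x") +
  labs(x = "Variables", y = "Svolumes")

print(p)
```

Because colour is mapped only inside geom_jitter, the smoother is not split by Subject. Moving aes(colour = Subject) into ggplot() would instead give one line per Subject in each facet.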