Problem description
I have a table with two columns: IMDB_ID and Actor. I am trying to find the pairs of actors who co-starred in 3 or more movies. The pairs of names should be unique, meaning that 'actor A, actor B' and 'actor B, actor A' are the same pair, so only one of them should appear. Here are a few rows of the table, but not the whole thing:
IMDB_ID ACTOR
---------- -----------
tt0111161 Tim Robbins
tt0111161 Morgan Free
tt0111161 Bob Gunton
tt0111161 William Sad
tt0111161 Clancy Brow
tt0111161 Gil Bellows
tt0111161 Mark Rolsto
tt0111161 James Whitm
tt0111161 Jeffrey DeM
tt0111161 Larry Brand
tt0111161 Neil Giunto
tt0111161 Brian Libby
tt0111161 David Prova
tt0111161 Joseph Ragn
tt0111161 Jude Ciccol
tt0068646 Marlon Bran
tt0068646 Al Pacino
I tried:
SELECT DISTINCT movie_actor.actor, movie_actor.actor, COUNT(movie_actor.actor) AS occurrence
FROM movie_actor join movie_actor
ON movie_actor.imdb_id = movies.imdb_id
WHERE occurrence >= 3
GROUP BY movie_actor.actor
ORDER BY occurrence DESC, movie_actor.actor ASC;
and received an operational error that said: ambiguous column name: movie_actor.actor
The final, working solution is below (note the strict inequality in the WHERE clause: it keeps only one ordering of each pair, which prevents reverse duplicates and also stops an actor from being paired with themselves):
SELECT DISTINCT ma.actor, ma2.actor, COUNT(*) AS occurrence
FROM movie_actor ma join movie_actor ma2
ON ma.imdb_id = ma2.imdb_id
WHERE ma.actor > ma2.actor
GROUP BY ma2.actor, ma.actor
HAVING occurrence >= 3
ORDER BY occurrence DESC, ma2.actor ASC
Recommended answer
You should be using table aliases when using a self-join: without them, the two copies of the table cannot be told apart, which is what the "ambiguous column name" error is reporting. However, your query has several other problems, indicating that you need more practice in writing SQL. Some good habits:
- Always use table aliases that are abbreviations of the table name.
- Always qualify the column names.
- Remember to use a HAVING clause when that is your intention, that is, when the condition tests an aggregate such as COUNT(*).
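As a rough illustration of the last habit, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and actor names are made up for the example and are not taken from the table in the question; only the table and column names (movie_actor, imdb_id, actor) come from the question.

```python
import sqlite3

# Hypothetical sample data, invented for this sketch.
rows = [
    ("m1", "Ann"), ("m1", "Bob"),
    ("m2", "Ann"), ("m2", "Bob"),
    ("m3", "Ann"), ("m3", "Cy"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie_actor (imdb_id TEXT, actor TEXT)")
conn.executemany("INSERT INTO movie_actor VALUES (?, ?)", rows)

# WHERE filters individual rows before grouping;
# HAVING filters whole groups after the aggregate has been computed.
query = """
    SELECT ma.actor, COUNT(*) AS movies
    FROM movie_actor ma
    WHERE ma.actor <> 'Cy'        -- row-level condition
    GROUP BY ma.actor
    HAVING COUNT(*) >= 3          -- group-level condition on the aggregate
    ORDER BY ma.actor
"""
result = conn.execute(query).fetchall()
print(result)
```

With this sample data only Ann appears in three movies, so she is the only row that survives the HAVING condition.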
Here is SQL closer to what you want to do:
SELECT ma.actor, ma2.actor, COUNT(*) AS occurrence
FROM movie_actor ma join
movie_actor ma2
ON ma.imdb_id = ma2.imdb_id
GROUP BY ma.actor, ma2.actor
HAVING COUNT(*) >= 3
ORDER BY occurrence DESC, ma.actor ASC;
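One thing worth checking: the query above has no condition ordering the two names, so the self-join also pairs each actor with themselves and returns every pair in both orders. The sketch below, again using Python's sqlite3 with made-up sample rows (an assumption, not the question's data), runs the same query with and without a `ma.actor < ma2.actor` filter to show the difference. Using `<` or `>` is an arbitrary choice; either keeps exactly one ordering of each pair.

```python
import sqlite3

# Hypothetical sample data: Ann and Bob share three movies, Cy appears once.
rows = [
    ("m1", "Ann"), ("m1", "Bob"),
    ("m2", "Ann"), ("m2", "Bob"),
    ("m3", "Ann"), ("m3", "Bob"), ("m3", "Cy"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie_actor (imdb_id TEXT, actor TEXT)")
conn.executemany("INSERT INTO movie_actor VALUES (?, ?)", rows)

# The same self-join, with a slot for an optional WHERE clause.
base = """
    SELECT ma.actor, ma2.actor, COUNT(*) AS occurrence
    FROM movie_actor ma JOIN
         movie_actor ma2
         ON ma.imdb_id = ma2.imdb_id
    {where}
    GROUP BY ma.actor, ma2.actor
    HAVING COUNT(*) >= 3
    ORDER BY occurrence DESC, ma.actor ASC, ma2.actor ASC
"""

# Without a filter: self-pairs and mirrored pairs are included.
unfiltered = conn.execute(base.format(where="")).fetchall()

# With the filter: only one ordering of each distinct pair remains.
filtered = conn.execute(
    base.format(where="WHERE ma.actor < ma2.actor")
).fetchall()

print(unfiltered)
print(filtered)
```

On this sample the unfiltered query returns four rows (Ann/Ann, Ann/Bob, Bob/Ann, Bob/Bob), while the filtered one returns the single pair Ann/Bob.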