Why is SELECT * considered harmful?


Question

Why is SELECT * bad practice?

I understand that SELECT COUNT(*) is a performance problem on some DBs, but what if you really wanted every column?

Recommended answer

The main reasons:

Inefficiency in moving data to the consumer. When you SELECT *, you're often retrieving more columns from the database than your application really needs to function. This moves more data from the database server to the client, which slows access, increases load on your machines, and takes more time to travel across the network. This is especially true when someone adds new columns to the underlying tables that didn't exist and weren't needed when the original consumers coded their data access.
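To make the "new columns" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table, its columns, and the `column_names` helper are hypothetical names chosen for illustration. The sketch only shows how the result shape changes, it does not measure performance.

```python
import sqlite3

# Hypothetical schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")


def column_names(sql):
    """Return the names of the columns a query sends back to the client."""
    cur = conn.execute(sql)
    return [d[0] for d in cur.description]


star_before = column_names("SELECT * FROM users")
explicit_before = column_names("SELECT id, name FROM users")

# Later, someone adds a (potentially large) column the consumer never needed.
conn.execute("ALTER TABLE users ADD COLUMN avatar BLOB")

star_after = column_names("SELECT * FROM users")
explicit_after = column_names("SELECT id, name FROM users")

print("SELECT *      :", star_before, "->", star_after)
print("explicit list :", explicit_before, "->", explicit_after)
```

The explicit column list keeps returning the same two columns after the schema change, while SELECT * starts shipping the new column to every existing consumer.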

Binding problems. When you SELECT *, it's possible to retrieve two columns of the same name from two different tables. This can often crash your data consumer. Imagine a query that joins two tables, both of which contain a column called "ID". How would a consumer know which was which? SELECT * can also confuse views (at least in some versions of SQL Server) when underlying table structures change: the view is not rebuilt, and the data that comes back can be nonsense. And the worst part is that you can take care to name your columns whatever you want, but the next person who comes along might have no way of knowing that they have to worry about adding a column that will collide with your already-developed names.
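The name collision described above can be reproduced with a small sketch, again using Python's sqlite3 module. The `orders` and `customers` tables are hypothetical, and the dictionary-based consumer is an assumption standing in for any client code that looks up columns by name.

```python
import sqlite3

# Hypothetical tables that both have a column called ID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (ID INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE customers (ID INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customers (ID, name) VALUES (7, 'alice');
    INSERT INTO orders (ID, customer_id) VALUES (100, 7);
""")

cur = conn.execute(
    "SELECT * FROM orders o JOIN customers c ON c.ID = o.customer_id"
)
names = [d[0] for d in cur.description]
row = cur.fetchone()

# A consumer that keys values by column name keeps only one of the two IDs.
as_dict = dict(zip(names, row))

# Naming and aliasing each column removes the ambiguity.
cur = conn.execute(
    "SELECT o.ID AS order_id, c.ID AS customer_id, c.name AS name "
    "FROM orders o JOIN customers c ON c.ID = o.customer_id"
)
fixed = dict(zip([d[0] for d in cur.description], cur.fetchone()))

print(names)
print(as_dict)
print(fixed)
```

In this sketch the order's ID is silently overwritten by the customer's ID in `as_dict`, whereas the aliased query keeps both values under distinct names.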

But it's not all bad for SELECT *. I use it liberally for these use cases:

Ad-hoc queries. When trying to debug something, especially off a narrow table I might not be familiar with, SELECT * is often my best friend. It helps me see what's going on without having to do a boatload of research into what the underlying column names are. This becomes a bigger "plus" the longer the column names get.

When * means "a row". In the following use cases, SELECT * is just fine, and rumors that it's a performance killer are urban legends that may have had some validity many years ago, but don't now:

SELECT COUNT(*) FROM table;

In this case, * means "count the rows". If you were to use a column name instead of *, it would count only the rows where that column's value is not null. COUNT(*), to me, really drives home the concept that you're counting rows, and you avoid strange edge cases caused by NULLs being eliminated from your aggregates.
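The difference between counting rows and counting non-null values is easy to check. The following sketch uses Python's sqlite3 module with a hypothetical `people` table in which one row has a NULL `email`.

```python
import sqlite3

# Hypothetical table: three rows, one of which has a NULL email.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO people (email) VALUES (?)",
    [("a@example.com",), (None,), ("c@example.com",)],
)

# COUNT(*) counts rows; COUNT(email) skips rows where email IS NULL.
count_rows = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
count_emails = conn.execute("SELECT COUNT(email) FROM people").fetchone()[0]

print(count_rows)    # 3
print(count_emails)  # 2
```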

The same goes for this type of query:

SELECT a.ID FROM TableA a
WHERE EXISTS (
    SELECT *
    FROM TableB b
    WHERE b.ID = a.B_ID);

In any database worth its salt, * just means "a row". It doesn't matter what you put in the subquery. Some people use b's ID in the SELECT list, or they'll use the number 1, but IMO those conventions are pretty much nonsensical. What you mean is "count the row", and that's what * signifies. Most query optimizers out there are smart enough to know this. (Though to be honest, I only know this to be true with SQL Server and Oracle.)
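As a rough check that the select list inside EXISTS does not change the result, the sketch below runs the query above against SQLite with several different select lists. The sample rows and the `matching_ids` helper are assumptions made for illustration. It demonstrates only that the results are identical, not how any particular optimizer plans the query, and SQLite is not one of the databases the answer vouches for.

```python
import sqlite3

# Sample data is hypothetical; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableB (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY, B_ID INTEGER);
    INSERT INTO TableB (ID) VALUES (1), (2);
    INSERT INTO TableA (ID, B_ID) VALUES (10, 1), (11, 2), (12, 99), (13, NULL);
""")


def matching_ids(select_list):
    """Run the EXISTS query with the given select list in the subquery."""
    sql = f"""
        SELECT a.ID FROM TableA a
        WHERE EXISTS (
            SELECT {select_list}
            FROM TableB b
            WHERE b.ID = a.B_ID)
        ORDER BY a.ID
    """
    return [r[0] for r in conn.execute(sql)]


# EXISTS only asks whether the subquery returns a row, not what is in it.
results = {s: matching_ids(s) for s in ("*", "b.ID", "1", "NULL")}
for select_list, ids in results.items():
    print(select_list, ids)
```

Every variant returns the same two rows of TableA, the ones whose B_ID matches a row in TableB.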
