我有以下查询,我想从数据库中获取100个项目,但是host_idurls表中很多次,并且我想从该表中每个。

select *
from urls
join hosts using(host_id)
where
(
    last_run_date is null
    or last_run_date <= date_sub(curdate(), interval 30 day)
)
and ignore_url != 1
limit 100


因此,我想:


最高结果= 100
每个主机的最大行数= 10


我不确定要完成此任务需要做什么。有没有子查询的方法吗?

主机表

CREATE TABLE `hosts` (
    `host_id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    `host` VARCHAR(50) NOT NULL,
    `last_fetched` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    `ignore_host` TINYINT(1) UNSIGNED NOT NULL,
    PRIMARY KEY (`host_id`),
    UNIQUE INDEX `host` (`host`)
)


乌尔斯表

CREATE TABLE `urls` (
    `url_id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    `parent_url_id` INT(10) UNSIGNED NOT NULL,
    `scheme` VARCHAR(5) NOT NULL,
    `host_id` INT(10) UNSIGNED NOT NULL,
    `path` VARCHAR(500) NOT NULL,
    `query` VARCHAR(500) NOT NULL,
    `date_found` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    `last_run_date` DATETIME NULL DEFAULT NULL,
    `ignore_url` TINYINT(1) UNSIGNED NOT NULL,
    PRIMARY KEY (`url_id`),
    UNIQUE INDEX `host_path_query` (`host_id`, `path`, `query`)
)

最佳答案

就是这样(我希望)

我不能测试我真实。我没有数据。请测试一下,然后给我一点声音。

SELECT *
  FROM (
    SELECT
      @nr:=IF(@lasthost = host_id, @nr+1, 1) AS nr,
      u.*,
      @lasthost:=IF(@lasthost = host_id, @lasthost, host_id) AS lasthost
      FROM
        urls u,
         ( SELECT @nr:=4, @lasthost:=-1 ) AS tmp
      WHERE (
            last_run_date IS NULL
            OR last_run_date <= date_sub(curdate(), INTERVAL 30 DAY)
      )
      AND ignore_url != 1
      ORDER BY host_id, last_run_date
  ) AS t
  LEFT JOIN HOSTS USING(host_id)
  WHERE t.nr < 11
  LIMIT 100;


好,

第一:

我只选择查询中的行,并对其进行排序
通过host_id和时间

SELECT
      u.*
      FROM
        urls u
         ( SELECT @nr:=4, @lasthost:=-1 ) AS tmp
      WHERE (
            last_run_date IS NULL
            OR last_run_date <= date_sub(curdate(), INTERVAL 30 DAY)
      )
      AND ignore_url != 1
      ORDER BY host_id, last_run_date


第二

我将变量nr和lasthost添加并在select中进行设置。现在
如果host_id更改,我将每行计数nr并将其重置为1。所以我得到了
每个host_id从1到n的行数的列表

选择
  @nr:= IF(@lasthost = host_id,@ nr + 1,1)AS nr,
  你*
  @lasthost:= IF(@lasthost = host_id,@ lasthost,host_id)作为lasthost
  从
    网址你,
     (选择@nr:= 4,@lasthost:=-1)AS tmp
  在哪里(
        last_run_date IS NULL
        或last_run_date   )
  AND ignore_url!= 1
  ORDER BY host_id,last_run_date

第三

我把这个查询放在一个新的选择中,这样我就可以加入第二张表,并且只对少于11行的结果进行限制,并将结果限制为100

SELECT *
  FROM (
    SELECT
      @nr:=IF(@lasthost = host_id, @nr+1, 1) AS nr,
      u.*,
      @lasthost:=IF(@lasthost = host_id, @lasthost, host_id) AS lasthost
      FROM
        urls u,
         ( SELECT @nr:=4, @lasthost:=-1 ) AS tmp
      WHERE (
            last_run_date IS NULL
            OR last_run_date <= date_sub(curdate(), INTERVAL 30 DAY)
      )
      AND ignore_url != 1
      ORDER BY host_id, last_run_date
  ) AS t
  LEFT JOIN HOSTS USING(host_id)
  WHERE t.nr < 11
  LIMIT 100;


就这样

关于mysql - 获得100行,每组最多10行,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/33043157/

10-11 23:49